
How to query documents from 10:00-14:00 each day over the past week in Lucene, and adapt the query for Timelion

Yes, you can do this with Lucene, and the same logic carries over to Timelion. Here's how, step by step.

1. Lucene Query Implementation

First, let's tackle the core Lucene query. Your goal is to fetch docs from the last week, but only those timestamped between 10:00 and 14:00 each day.

Assuming your time field is named timestamp and is indexed as a LongPoint holding epoch milliseconds, there are two ways to build the query:

Option 1: Pre-stored Hour Field

If you can add a precomputed hour_of_day integer field to your docs at index time (extracting the hour from the timestamp), the query is straightforward:

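import org.apache.lucene.document.IntPoint;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;

// Calculate the time window: last 7 days (a rolling 168 hours ending now)
long now = System.currentTimeMillis();
long oneWeekAgo = now - 7 * 24 * 60 * 60 * 1000L;

// Query for docs in the 7-day range (both bounds inclusive)
Query dateRangeQuery = LongPoint.newRangeQuery("timestamp", oneWeekAgo, now);

// Query for docs where hour is between 10 and 13 inclusive (covers 10:00 to 13:59:59, before 14:00).
// Assumes hour_of_day is indexed as an IntPoint; NumericRangeQuery no longer exists in Lucene versions that have LongPoint.
Query hourFilter = IntPoint.newRangeQuery("hour_of_day", 10, 13);

// Combine both filters with a boolean query
BooleanQuery finalQuery = new BooleanQuery.Builder()
    .add(dateRangeQuery, BooleanClause.Occur.MUST)
    .add(hourFilter, BooleanClause.Occur.MUST)
    .build();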
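Option 1 depends on hour_of_day being computed when each document is indexed. A minimal sketch of that extraction, using only java.time; the class name HourOfDay and the choice of UTC are assumptions for illustration, not something the question specifies:

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class HourOfDay {
    // Hour (0-23) of an epoch-millisecond timestamp, as seen in the given zone.
    static int of(long epochMillis, ZoneId zone) {
        return Instant.ofEpochMilli(epochMillis).atZone(zone).getHour();
    }

    public static void main(String[] args) {
        long ts = Instant.parse("2024-01-15T13:59:59.999Z").toEpochMilli();
        // Assumption: UTC is the zone in which "10:00-14:00" is meant.
        int hour = of(ts, ZoneOffset.UTC);
        System.out.println(hour); // prints 13
    }
}
```

With Lucene on the classpath, the resulting value would be added to the document next to the timestamp, for example as an IntPoint named hour_of_day, so that the range query above can match it.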

Option 2: Dynamic Hour Extraction (No Pre-stored Field)

If you can't add a new field, you can extract the hour from the timestamp at query time with FunctionMatchQuery from the lucene-queries module, which tests a predicate against each document's value (note: this evaluates every candidate document, so it is less efficient for large datasets):

import java.time.Instant;
import java.time.ZoneOffset;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.queries.function.FunctionMatchQuery;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.DoubleValuesSource;
import org.apache.lucene.search.Query;

long now = System.currentTimeMillis();
long oneWeekAgo = now - 7 * 24 * 60 * 60 * 1000L;
Query dateRangeQuery = LongPoint.newRangeQuery("timestamp", oneWeekAgo, now);

// Read each timestamp from doc values and keep docs whose hour falls in [10, 14).
// Assumes "timestamp" is also indexed as a NumericDocValuesField; a LongPoint alone has no doc values.
Query hourFilter = new FunctionMatchQuery(
    DoubleValuesSource.fromLongField("timestamp"),
    millis -> {
        // Critical: use the timezone in which 10:00-14:00 is meant
        int hour = Instant.ofEpochMilli((long) millis).atZone(ZoneOffset.UTC).getHour();
        return hour >= 10 && hour < 14;
    });

BooleanQuery finalQuery = new BooleanQuery.Builder()
    .add(dateRangeQuery, BooleanClause.Occur.MUST)
    .add(hourFilter, BooleanClause.Occur.MUST)
    .build();

2. Adapting to Timelion

Since Timelion sits on top of Elasticsearch (which uses Lucene under the hood), the same logic carries over: first express it as an Elasticsearch query, then as a Timelion expression.

Step 1: Elasticsearch Query DSL

Here's the equivalent query for Elasticsearch (adjust the timezone and field names as needed):

{
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "timestamp": {
              "gte": "now-7d/d",
              "lte": "now/d"
            }
          }
        },
        {
          "script": {
            "script": {
              "source": "doc['timestamp'].value.withZoneSameInstant(ZoneId.of('UTC')).getHour() >= 10 && doc['timestamp'].value.withZoneSameInstant(ZoneId.of('UTC')).getHour() < 14",
              "lang": "painless"
            }
          }
        }
      ]
    }
  }
}
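Note that the two windows are not identical: the Lucene snippets use a rolling 7 * 24 hours ending now, while "now-7d/d" and "now/d" are rounded to day boundaries (down for gte, up for lte, in UTC unless a time_zone is given). If the Lucene query should cover the same span, the bounds can be computed as in this sketch; the class name DayAlignedWindow is hypothetical:

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class DayAlignedWindow {
    // Returns {start, end} in epoch millis: start of the day 7 days ago
    // through the last millisecond of the current day, in the given zone.
    static long[] window(Instant now, ZoneId zone) {
        LocalDate today = now.atZone(zone).toLocalDate();
        long start = today.minusDays(7).atStartOfDay(zone).toInstant().toEpochMilli();
        long end = today.plusDays(1).atStartOfDay(zone).toInstant().toEpochMilli() - 1;
        return new long[] {start, end};
    }

    public static void main(String[] args) {
        long[] w = window(Instant.parse("2024-01-15T12:30:00Z"), ZoneOffset.UTC);
        System.out.println(Instant.ofEpochMilli(w[0])); // prints 2024-01-08T00:00:00Z
        System.out.println(Instant.ofEpochMilli(w[1])); // prints 2024-01-15T23:59:59.999Z
    }
}
```

The two values can be passed to LongPoint.newRangeQuery("timestamp", start, end) in place of oneWeekAgo and now, since that query treats both bounds as inclusive.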

Step 2: Timelion Syntax

Timelion's .es() function reads its q parameter as a Lucene query string (the same syntax as the Kibana search bar), not as Query DSL JSON. The request body above therefore cannot be pasted into q, and a Painless script cannot be expressed there. The practical route is the pre-stored hour_of_day field: filter on it in q, point timefield at timestamp, and set the Kibana time picker to the last 7 days to supply the date range:

.es(index=your_target_index, timefield=timestamp, q='hour_of_day:[10 TO 13]')

The range [10 TO 13] is inclusive on both ends, so it matches 10:00 through 13:59:59, the same as the Lucene filter.

3. Critical Things to Remember
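  • Timezone Consistency: This is the #1 pitfall! Epoch-millisecond timestamps carry no timezone of their own, so the hour you extract depends entirely on the zone you convert with. Use the zone in which "10:00-14:00" is meant (UTC in the examples here), and use the same one everywhere you extract the hour. If your window is in local time instead of UTC, adjust the script/field extraction accordingly.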
  • Performance: Pre-storing the hour_of_day field will make your queries run much faster, especially on large datasets. Scripted queries work but add overhead.
  • Timelion Quoting: Wrap the value of Timelion's q parameter in single quotes, and escape any single quote inside it with \' to avoid syntax errors.
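To make the timezone point concrete, here is a small sketch showing that one and the same instant can fall inside or outside the 10:00-14:00 window depending on the zone used. The class name WindowByZone and the sample zones are illustrative assumptions:

```java
import java.time.Instant;
import java.time.ZoneId;

public class WindowByZone {
    // True if the instant's local hour in the given zone is in [10, 14).
    static boolean inWindow(Instant t, ZoneId zone) {
        int hour = t.atZone(zone).getHour();
        return hour >= 10 && hour < 14;
    }

    public static void main(String[] args) {
        Instant t = Instant.parse("2024-01-15T12:30:00Z");
        System.out.println(inWindow(t, ZoneId.of("UTC")));           // prints true  (12:30 UTC)
        System.out.println(inWindow(t, ZoneId.of("Asia/Shanghai"))); // prints false (20:30 local)
    }
}
```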

This question originates from Stack Exchange; it was asked by RockD.
