From ea7a2e2845b3cca3c1d7caeb14db091ed2152ee5 Mon Sep 17 00:00:00 2001
From: Paul Masurel
Date: Fri, 11 Aug 2023 19:40:22 +0900
Subject: [PATCH] Scroll and exists query (#3724)

* Added docs for scroll and exists query.

Closes #3715
Closes #3713

* Apply suggestions from code review

Co-authored-by: Igor Motov

* Code review comments

---------

Co-authored-by: Igor Motov
---
 docs/configuration/index-config.md  |  2 +
 docs/reference/es_compatible_api.md | 88 ++++++++++++++++++++++++++++-
 2 files changed, 88 insertions(+), 2 deletions(-)

diff --git a/docs/configuration/index-config.md b/docs/configuration/index-config.md
index 4dc3b62c847..8b9ac545c86 100644
--- a/docs/configuration/index-config.md
+++ b/docs/configuration/index-config.md
@@ -53,6 +53,7 @@ doc_mapping:
           tokenizer: raw
   tag_fields: ["resource.service"]
   timestamp_field: timestamp
+  index_field_presence: true
 
 search_settings:
   default_search_fields: [severity_text, body]
@@ -93,6 +94,7 @@ The doc mapping defines how a document and the fields it contains are stored and
 | `timestamp_field` | Timestamp field* used for sharding documents in splits. The field has to be of type `datetime`. [Learn more about time sharding](./../overview/architecture.md). | `None`
 | `partition_key` | If set, quickwit will route documents into different splits depending on the field name declared as the `partition_key`. | `null` |
 | `max_num_partitions` | Limits the number of splits created through partitioning. (See [Partitioning](../overview/concepts/querying.md#partitioning)) | `200` |
+| `index_field_presence` | Must be enabled to allow `exists` queries. Enabling it can add a significant CPU cost at indexing time. | `false` |
 
 *: tags fields and timestamp field are expressed as a path from the root of the JSON object to the given field. If a field name contains a `.` character, it needs to be escaped with a `\` character.
 
diff --git a/docs/reference/es_compatible_api.md b/docs/reference/es_compatible_api.md
index 5e6c03945b2..570eba7ff3c 100644
--- a/docs/reference/es_compatible_api.md
+++ b/docs/reference/es_compatible_api.md
@@ -121,7 +121,8 @@ If a parameter appears both as a query string parameter and in the JSON payload,
 | `from` | `Integer` | The rank of the first hit to return. This is useful for pagination. | 0 |
 | `q` | `String` | The search query. | (Optional) |
 | `size` | `Integer` | Number of hits to return. | 10 |
-| `sort` | `String` | (Optional) |
+| `sort` | `String` | Describes how documents should be ranked. See [Sort order](#sort-order) | (Optional) |
+| `scroll` | `Duration` | Creates a scroll context with the given time to live. See [Scroll](#_scroll--scroll-api). | (Optional) |
 
 #### Supported Request Body parameters
 
@@ -131,10 +132,47 @@ If a parameter appears both as a query string parameter and in the JSON payload,
 | `from` | `Integer` | The rank of the first hit to return. This is useful for pagination. | 0 |
 | `query` | `Json object` | Describe the search query. See [Query DSL](#query-dsl) | (Optional) |
 | `size` | `Integer` | Number of hits to return. | 10 |
-| `sort` | `JsonObject[]` | Describes how documents should be ranked. | `[]` |
+| `sort` | `JsonObject[]` | Describes how documents should be ranked. See [Sort order](#sort-order) | `[]` |
 | `aggs` | `Json object` | Aggregation definition. See [Aggregations](aggregation.md). | `{}` | `
 
 
+#### Sort order
+
+You can sort on up to two criteria.
+The second criterion is only used to break ties on the first criterion.
+
+A given criterion can be either:
+- the name of a fast field (explicitly defined in the schema or captured by the dynamic mode),
+- `_score` to sort by BM25 score.
+
+By default, the sort order is ascending for fast fields and descending for `_score`.
+
+When sorting by a fast field that contains several values in a single document, only the first value is used for sorting.
+
+The sort order can be set explicitly to ascending or descending using the
+following syntax.
+
+```json
+{
+  // ...
+  "sort" : [
+    { "timestamp" : {"order" : "asc"}},
+    { "serial_number" : "desc" }
+  ]
+  // ...
+}
+
+```
+
+It is also possible to omit the order and rely on the default, using the following syntax.
+
+```json
+{ //...
+  "sort" : ["_score", "timestamp"]
+  // ...
+}
+```
+
 ### `_msearch`   Multi search API
 
 ```
@@ -158,6 +196,31 @@ The payload is expected to alternate:
 - a `header` json object, containing the targetted index id.
 - a `search request body` as defined in the [`_search` endpoint section].
 
+
+### `_search/scroll`   Scroll API
+
+```
+GET api/v1/_elastic/_search/scroll
+```
+
+#### Supported Request Body parameters
+
+| Variable | Type | Description | Default value |
+|---------------|------------|------------------------------------------------------------------|---------------|
+| `scroll_id` | `String` | Scroll id (obtained from a search response). | (Required) |
+
+
+The `_search/scroll` endpoint, in combination with the `_search` API, makes it possible to request successive pages of search results.
+First, the client calls the `_search` API with a `scroll` query parameter, then passes the `scroll_id` returned in the response payload to the `_search/scroll` endpoint.
+
+Each subsequent call to the `_search/scroll` endpoint will return a new `scroll_id` pointing to the next page.
+
+:::caution
+
+The scroll API should not be used to fetch results beyond the 10,000th hit.
+
+:::
+
 ## Query DSL
 
 [Elasticsearch Query DSL reference](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl.html).
@@ -346,4 +409,25 @@ The following query types are supported.
 ```
 
 
+### `exists`
+
+[Elasticsearch reference documentation](https://www.elastic.co/guide/en/elasticsearch/reference/8.8/query-dsl-exists-query.html)
+
+Matches only documents that contain a non-null value for the given field.
+
+#### Example
+
+```json
+{
+  "exists": {
+    "field": "author.login"
+  }
+}
+```
+
+#### Supported Parameters
+
+| Variable | Type | Description | Default |
+|-------------------|------------|------------------------------------------------------------------|---------|
+| `field` | `String` | Name of the field; only documents with a value for this field are returned. | - |
 
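Reviewer note: the scroll flow and `exists` query documented in this patch can be exercised with a rough client-side sketch like the one below. It is only an illustration under stated assumptions, not Quickwit's client API: the `send` transport callable, the `api/v1/_elastic/{index}/_search` path, and the `_scroll_id` and `hits.hits` response keys (Elasticsearch's conventions) are not taken from this patch. The `exists` query also presumes the index was created with `index_field_presence: true`.

```python
from typing import Any, Callable, Dict, Iterator, List

# Hypothetical transport: (path, query_params, json_body) -> parsed JSON response.
# In practice this could wrap any HTTP client; it is injected so the sketch stays testable.
Transport = Callable[[str, Dict[str, str], Dict[str, Any]], Dict[str, Any]]

# The docs in this patch caution against scrolling past the 10,000th result.
MAX_SCROLL_HITS = 10_000


def build_search_body(field: str, size: int = 10) -> Dict[str, Any]:
    """Build a request body combining an `exists` query with a two-criteria sort."""
    return {
        "query": {"exists": {"field": field}},
        "size": size,
        # Sort fields reuse the names from the documentation example.
        "sort": [
            {"timestamp": {"order": "asc"}},
            {"serial_number": "desc"},
        ],
    }


def scroll_pages(
    send: Transport, index: str, body: Dict[str, Any], scroll: str = "1m"
) -> Iterator[List[Dict[str, Any]]]:
    """Yield successive pages of hits until a page is empty or the cap is reached."""
    # First call: regular search with a `scroll` query parameter (path is an assumption).
    response = send(f"api/v1/_elastic/{index}/_search", {"scroll": scroll}, body)
    fetched = 0
    while True:
        # Assumed response shape: Elasticsearch-style `hits.hits` and `_scroll_id`.
        hits = response.get("hits", {}).get("hits", [])
        if not hits:
            return
        yield hits
        fetched += len(hits)
        scroll_id = response.get("_scroll_id")
        if scroll_id is None or fetched >= MAX_SCROLL_HITS:
            return
        # Each call returns a new scroll id pointing to the next page.
        response = send("api/v1/_elastic/_search/scroll", {}, {"scroll_id": scroll_id})
```

Injecting the transport keeps the pagination logic independent of any particular HTTP library, and the cap simply stops iterating once 10,000 hits have been yielded.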