Collapsing search results new page added to documentation #7919

Merged
35 commits
23e0e89
adding documentation for collapsing search results
leanneeliatra Aug 6, 2024
3b21a0c
Merge branch 'main' into collapse-search-results-#7507
leanneeliatra Aug 7, 2024
53f5f3f
Merge branch 'main' into collapse-search-results-#7507
leanneeliatra Aug 15, 2024
3ab36b1
Clarifying collapsing of search results
leanneeliatra Aug 15, 2024
fa3ffbd
updating collapsing example as per request
leanneeliatra Aug 19, 2024
84bec0d
updates as per reviewdog
leanneeliatra Aug 20, 2024
18befbb
updates as per review dog
leanneeliatra Aug 20, 2024
d040c7d
Merge branch 'main' into collapse-search-results-#7507
leanneeliatra Aug 20, 2024
a5a1a5a
remove unneeded space
leanneeliatra Aug 20, 2024
b7a0f88
Merge branch 'main' into collapse-search-results-#7507
vagimeli Aug 28, 2024
d3b6fe6
Update _search-plugins/collapse-search.md
leanneeliatra Aug 30, 2024
bdedb11
Update _search-plugins/collapse-search.md
leanneeliatra Aug 30, 2024
856b6c6
Update _search-plugins/collapse-search.md
leanneeliatra Aug 30, 2024
ad12cd8
Update _search-plugins/collapse-search.md
leanneeliatra Aug 30, 2024
1d1e5ca
Apply suggestions from code review
leanneeliatra Aug 30, 2024
9942b78
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
587e3b6
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
6d1cc8f
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
df62330
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
9544dcd
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
c74e350
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
80c8b03
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
38a6cd8
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
c8cac89
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
0c68494
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
3df37b8
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
535f77a
Update _search-plugins/collapse-search.md
vagimeli Sep 12, 2024
b606e21
Update _search-plugins/collapse-search.md
leanneeliatra Sep 17, 2024
63a62e3
Merge branch 'main' into collapse-search-results-#7507
leanneeliatra Sep 17, 2024
5dbb1c6
Review suggestions addressed
leanneeliatra Sep 17, 2024
5535fcf
update to language
leanneeliatra Sep 17, 2024
4e6c712
update to language from review
leanneeliatra Sep 17, 2024
cf38f29
Merge branch 'main' into collapse-search-results-#7507
leanneeliatra Sep 18, 2024
224721d
Merge branch 'main' into collapse-search-results-#7507
vagimeli Sep 24, 2024
d2e728f
Merge branch 'main' into collapse-search-results-#7507
vagimeli Sep 24, 2024
332 changes: 332 additions & 0 deletions _search-plugins/collapse-search.md
@@ -0,0 +1,332 @@
---
layout: default
title: Collapse search results
nav_order: 3
---

# Collapse search results

The `collapse` parameter in OpenSearch groups search results by the value of a particular field and returns only the top document in each group. This is especially useful for reducing redundancy and improving performance when working with large datasets: when many matching documents share the same field value, the results contain one entry per value instead of a list of duplicates.
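
The examples on this page query a `bakery-items` index. As a minimal sketch, the index could be created with a mapping like the following. The field types are assumptions inferred from the sample values used on this page, not something the page states. Mapping `item_name` and `baker` as `keyword` is what makes them usable as collapse fields.

```json
PUT /bakery-items
{
  "mappings": {
    "properties": {
      "item_name": { "type": "keyword" },
      "category": { "type": "keyword" },
      "price": { "type": "float" },
      "baked_date": { "type": "date" },
      "baker": { "type": "keyword" }
    }
  }
}
```

If the index is instead created through dynamic mapping, string fields are typically mapped as `text`, which cannot be collapsed directly.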

### Example of collapsing search results

To collapse search results by the `item_name` field and sort them by `price`, use the following query:

```json
GET /bakery-items/_search
{
  "query": {
    "match": {
      "category": "cakes"
    }
  },
  "collapse": {
    "field": "item_name"
  },
  "sort": ["price"],
  "from": 0
}
```

Collapsing affects only the top hits and does not affect aggregations. The total number of hits in the response is the count of matching documents before collapsing is applied. The number of unique groups formed by collapsing is not returned. To be used for collapsing, a field must be a single-valued `keyword` or numeric field with `doc_values` enabled.
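
Because aggregations are not affected by collapsing, one way to estimate the number of groups is to add a `cardinality` aggregation on the collapse field to the same request. The following is a sketch under that assumption. The aggregation name `unique_items` is hypothetical, and the count is approximate, so treat it as an estimate rather than an exact group count.

```json
GET /bakery-items/_search
{
  "query": {
    "match": {
      "category": "cakes"
    }
  },
  "collapse": {
    "field": "item_name"
  },
  "aggs": {
    "unique_items": {
      "cardinality": {
        "field": "item_name"
      }
    }
  }
}
```

In the response, `hits.total` still reports all matching documents, while the aggregation result reports the approximate number of distinct `item_name` values among them.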

### Expanding collapsed results

You can expand each collapsed top hit with the `inner_hits` property.

The following query retrieves up to five items for each `item_name`, sorted by `price` in ascending order:

```json
GET /bakery-items/_search
{
  "query": {
    "match": {
      "category": "Pastry"
    }
  },
  "collapse": {
    "field": "item_name",
    "inner_hits": {
      "name": "top_items",
      "size": 5,
      "sort": [{ "price": "asc" }]
    }
  },
  "sort": ["price"]
}
```

### Multiple inner hits for each collapsed hit

To obtain several groups of inner hits for each collapsed result, you can set different criteria for each group. For example, you could request the three least expensive items and the three most recent items for every bakery item.

```json
GET /bakery-items/_search
{
  "query": {
    "match": {
      "category": "cakes"
    }
  },
  "collapse": {
    "field": "item_name",
    "inner_hits": [
      {
        "name": "cheapest_items",
        "size": 3,
        "sort": ["price"]
      },
      {
        "name": "newest_items",
        "size": 3,
        "sort": [{ "baked_date": "desc" }]
      }
    ]
  },
  "sort": ["price"]
}
```

This query searches for documents in the `cakes` category and groups the results by the `item_name` field. For each `item_name`, it returns two sets of inner hits: the three lowest-priced items (`cheapest_items`) and the three most recent items (`newest_items`), the latter sorted by `baked_date` in descending order.

Each group is expanded by sending an additional query for every inner hit request for every collapsed hit returned in the response. This can significantly slow down the search if there are many groups or many inner hit requests. Use the `max_concurrent_group_searches` request parameter to limit the number of concurrent searches allowed in this phase. The default is based on the number of data nodes and the default search thread pool size.
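
As a sketch of how this limit can be set, the following request places `max_concurrent_group_searches` inside the `collapse` object, next to `inner_hits`. The value `4` is an arbitrary illustration, not a recommendation.

```json
GET /bakery-items/_search
{
  "query": {
    "match": {
      "category": "cakes"
    }
  },
  "collapse": {
    "field": "item_name",
    "inner_hits": {
      "name": "cheapest_items",
      "size": 3,
      "sort": ["price"]
    },
    "max_concurrent_group_searches": 4
  },
  "sort": ["price"]
}
```

A lower value reduces the load that the expansion phase places on the cluster, at the cost of a slower response when there are many groups.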

### Second level of collapsing

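A second level of collapsing is also supported and is applied to `inner_hits`. The following request collapses results by `baker` and returns each baker's three most recent items as inner hits:
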
```json
GET /bakery-items/_search
{
"query": {
"match": {
"category": "cakes"
}
},
"collapse": {
"field": "baker",
"inner_hits": {
"name": "recent_items",
"size": 3,
"sort": [{ "baked_date": "desc" }]
}
}
}
```


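The response contains one collapsed hit for each `baker`, with that baker's most recent items returned in `recent_items`: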
```json
{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 6,
      "relation": "eq"
    },
    "max_score": 0.61310446,
    "hits": [
      {
        "_index": "bakery-items",
        "_id": "fSW9KJEB3prnZWCKY5wg",
        "_score": 0.61310446,
        "_source": {
          "item_name": "Chocolate Cake",
          "category": "cakes",
          "price": 15,
          "baked_date": "2023-07-01T00:00:00Z",
          "baker": "Baker A"
        },
        "fields": {
          "baker": [
            "Baker A"
          ]
        },
        "inner_hits": {
          "recent_items": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": null,
              "hits": [
                {
                  "_index": "bakery-items",
                  "_id": "gCW9KJEB3prnZWCKY5wg",
                  "_score": null,
                  "_source": {
                    "item_name": "Chocolate Cake",
                    "category": "cakes",
                    "price": 18,
                    "baked_date": "2023-07-04T00:00:00Z",
                    "baker": "Baker A"
                  },
                  "sort": [
                    1688428800000
                  ]
                },
                {
                  "_index": "bakery-items",
                  "_id": "fSW9KJEB3prnZWCKY5wg",
                  "_score": null,
                  "_source": {
                    "item_name": "Chocolate Cake",
                    "category": "cakes",
                    "price": 15,
                    "baked_date": "2023-07-01T00:00:00Z",
                    "baker": "Baker A"
                  },
                  "sort": [
                    1688169600000
                  ]
                }
              ]
            }
          }
        }
      },
      {
        "_index": "bakery-items",
        "_id": "fiW9KJEB3prnZWCKY5wg",
        "_score": 0.61310446,
        "_source": {
          "item_name": "Vanilla Cake",
          "category": "cakes",
          "price": 12,
          "baked_date": "2023-07-02T00:00:00Z",
          "baker": "Baker B"
        },
        "fields": {
          "baker": [
            "Baker B"
          ]
        },
        "inner_hits": {
          "recent_items": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": null,
              "hits": [
                {
                  "_index": "bakery-items",
                  "_id": "gSW9KJEB3prnZWCKY5wg",
                  "_score": null,
                  "_source": {
                    "item_name": "Vanilla Cake",
                    "category": "cakes",
                    "price": 14,
                    "baked_date": "2023-07-05T00:00:00Z",
                    "baker": "Baker B"
                  },
                  "sort": [
                    1688515200000
                  ]
                },
                {
                  "_index": "bakery-items",
                  "_id": "fiW9KJEB3prnZWCKY5wg",
                  "_score": null,
                  "_source": {
                    "item_name": "Vanilla Cake",
                    "category": "cakes",
                    "price": 12,
                    "baked_date": "2023-07-02T00:00:00Z",
                    "baker": "Baker B"
                  },
                  "sort": [
                    1688256000000
                  ]
                }
              ]
            }
          }
        }
      },
      {
        "_index": "bakery-items",
        "_id": "fyW9KJEB3prnZWCKY5wg",
        "_score": 0.61310446,
        "_source": {
          "item_name": "Red Velvet Cake",
          "category": "cakes",
          "price": 20,
          "baked_date": "2023-07-03T00:00:00Z",
          "baker": "Baker C"
        },
        "fields": {
          "baker": [
            "Baker C"
          ]
        },
        "inner_hits": {
          "recent_items": {
            "hits": {
              "total": {
                "value": 2,
                "relation": "eq"
              },
              "max_score": null,
              "hits": [
                {
                  "_index": "bakery-items",
                  "_id": "giW9KJEB3prnZWCKY5wg",
                  "_score": null,
                  "_source": {
                    "item_name": "Red Velvet Cake",
                    "category": "cakes",
                    "price": 22,
                    "baked_date": "2023-07-06T00:00:00Z",
                    "baker": "Baker C"
                  },
                  "sort": [
                    1688601600000
                  ]
                },
                {
                  "_index": "bakery-items",
                  "_id": "fyW9KJEB3prnZWCKY5wg",
                  "_score": null,
                  "_source": {
                    "item_name": "Red Velvet Cake",
                    "category": "cakes",
                    "price": 20,
                    "baked_date": "2023-07-03T00:00:00Z",
                    "baker": "Baker C"
                  },
                  "sort": [
                    1688342400000
                  ]
                }
              ]
            }
          }
        }
      }
    ]
  }
}
```
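
The request and response above use a single level of collapsing with inner hits. To apply a second level, a `collapse` object can be nested inside the `inner_hits` definition. The following is a minimal sketch, assuming the same `bakery-items` index. The inner hit name `distinct_items` is hypothetical. It collapses the results by `baker` and then collapses each baker's inner hits by `item_name`, so that each baker's inner hits contain at most one document per item.

```json
GET /bakery-items/_search
{
  "query": {
    "match": {
      "category": "cakes"
    }
  },
  "collapse": {
    "field": "baker",
    "inner_hits": {
      "name": "distinct_items",
      "size": 3,
      "collapse": {
        "field": "item_name"
      }
    }
  }
}
```

The nested `collapse` object takes only a `field` in this sketch. Further `inner_hits` inside the second-level collapse are not expected to be supported.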

By using collapsing and inner hits effectively, you can manage large datasets in your bakery inventory, reduce redundancy, and focus on the most relevant information. This technique helps streamline search results, providing a clear and concise view of your data.

