Filtering search results documentation addition #7667

Closed
131 changes: 131 additions & 0 deletions _search-plugins/searching-data/filtering-search-results.md
Member

One key thing missing here is that DQL is a filtering language for OpenSearch Dashboards. I would not mention anything about Dev Tools here, since all filtering today can be done without ever needing to go there. There are 3 main ways to filter in OSD:

  1. DQL or Lucene in the query bar
  2. The Filter button, which provides both a form to create a new filter and an advanced view to enter Query DSL directly
  3. The time range picker

It's also worth calling out what you mean by search here, since that's Discover.

Contributor Author

Thank you Ashwin. I am addressing your comments and adding in your suggestions. My question is about "It's also worth calling out what you mean by search here, since that's Discover": are you referring to a specific part here?
Thanks.

Collaborator

@ashwin-pc @leanneeliatra We already have this documentation in https://opensearch.org/docs/latest/dashboards/discover/index-discover/. Is there a reason for this PR? If we need more comprehensive information about Query DSL using the query bar in Discover, then we should add to the existing doc. If I'm misunderstanding the purpose of this doc, then please clarify what the purpose of this doc is. Thank you.

cc: @hdhalter

@@ -0,0 +1,131 @@
---
layout: default
title: Filter results
parent: Searching data
nav_order: 21
redirect_from:
- /opensearch/search/filter/
---
# Filtering search results in OpenSearch

Filtering search results in OpenSearch lets you narrow the returned documents to those that match criteria you specify, such as ranges, conditions, or specific terms. This is especially helpful with large datasets, because a smaller, more relevant result set is easier to interpret.

Filtering allows you to:
- Improve search accuracy by excluding unneeded information on a case-by-case basis, so that results can be interpreted accurately.
- Enhance query performance by reducing the amount of data that must be processed and returned.
- Categorize results so that you can explore the data in a structured manner.

## Filtering with OpenSearch Dashboards

To begin filtering in OpenSearch Dashboards, follow these steps:

1. Navigate to the OpenSearch Dashboards UI.
2. Click on `Discover` in the sidebar.
3. Choose `opensearch_dashboards_sample_data_flights` from the index pattern selector.

Member

This should be more generic and also point them to the page on how to create an index pattern. A user might not have installed the flights sample dataset. By default they don't have any sample datasets installed, and even if they do manually add them, they will not add all of them.

Collaborator

@vagimeli vagimeli Jul 17, 2024

See additions at line 16


### Example: Filter by destination airport

Member

Screenshots will be really useful here

Collaborator

Added


To filter flights arriving at "Zurich Airport":

**Add a filter:**
- Click on the `Add filter` button.
- In the filter field, select `Dest`.
- In the operator field, select `is`.
- In the value field, type `Zurich Airport`.
- Click `Save`.

This will display only the flights arriving at Zurich Airport.
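
For reference, the filter created above corresponds roughly to the following Query DSL, which is what the filter editor's Query DSL view would be expected to show. This is a sketch only: the `match_phrase` form is an assumption about how Dashboards translates the `is` operator, and the exact JSON may differ between versions.

```json
{
  "query": {
    "match_phrase": {
      "Dest": "Zurich Airport"
    }
  }
}
```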

### Example: Filter by cancelled flights

To filter flights that have been cancelled in the last 100 days:
- Click on the time frame to update the time selection.
- Under the 'Relative' time tab, change the value to '100'.
- From the dropdown, select 'Days ago'.
- The data shown is now from the last 100 days.

**Add a filter:**
- Click on the `Add filter` button.
- In the filter field, select `Cancelled`.
- In the operator field, select `is`.
- In the value field, type `true`.
- Click `Save`.

With the `Dest` filter from the previous example still applied, this will display only the cancelled flights with a destination of Zurich Airport in the last 100 days.
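
The same combination of filters can be sketched as a Query DSL request for the Dev Tools console. This assumes the flights sample index, that its time field is named `timestamp`, and that `Dest` and `Cancelled` are mapped as keyword and Boolean fields; adjust the names for your own data.

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "Dest": "Zurich Airport" } },
        { "term": { "Cancelled": true } },
        { "range": { "timestamp": { "gte": "now-100d" } } }
      ]
    }
  }
}
```

The `now-100d` date math expression plays the role of the relative time selection, and each entry in the `filter` array corresponds to one filter added in Discover.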

## Using Query DSL for advanced filtering

Member

You can actually do this in the query bar too. See the Edit as Query DSL below.
[Image: Screenshot 2024-07-11 at 10 26 50 AM]

Collaborator

@vagimeli vagimeli Jul 16, 2024

The Dev Tools capabilities with examples are in https://opensearch.org/docs/latest/dashboards/dev-tools/index-dev/. @ashwin-pc Do you have a demo that I can use to focus on using the query bar?


OpenSearch Query DSL (domain-specific language) allows for more complex and powerful queries. You can combine multiple conditions and use Boolean logic to filter data.

Collaborator

Added.


### Example of Query DSL filtering

Collaborator

I would remove this section. There already is info about filtering here https://opensearch.org/docs/latest/query-dsl/query-filter-context/#filter-context

Collaborator

Removed

Collaborator

Revised to address @ashwin-pc comment re: option to use the Discover application

Contributor Author

Great thank you @vagimeli


To use Query DSL, you need to define your queries in JSON format.

To begin filtering using Query DSL in OpenSearch Dashboards, follow these steps:

1. Navigate to the OpenSearch Dashboards UI.
2. Click on `Dev Tools` in the sidebar.
3. Write your DSL query in the left pane of the Dev Tools console.
4. Highlight the query and click the play button to run the query.
5. The response is displayed in the right pane of the Dev Tools console.

### Example: Filter flights with delay greater than 60 minutes

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "query": {
    "range": {
      "FlightDelayMin": {
        "gt": 60
      }
    }
  }
}
```

This DSL query retrieves the documents in which the flight delay is greater than 60 minutes.
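
To check how many documents a filter matches without retrieving them, the same `range` clause can be sent to the `_count` endpoint instead of `_search`. A minimal sketch, assuming the same sample index and field as in the example above:

```json
GET opensearch_dashboards_sample_data_flights/_count
{
  "query": {
    "range": {
      "FlightDelayMin": {
        "gt": 60
      }
    }
  }
}
```

The response contains a `count` field rather than a list of hits, which makes it a quick way to confirm that a filter is neither too broad nor too narrow.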

### Example: Combined filter with Query DSL

To filter flights operated by "Logstash Airways" with an average ticket price (`AvgTicketPrice`) between 0 and 1000 and a destination country (`DestCountry`) of Italy, you can use the following DSL query:

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "Carrier": "Logstash Airways"
          }
        },
        {
          "range": {
            "AvgTicketPrice": {
              "gte": 0,
              "lte": 1000
            }
          }
        },
        {
          "term": {
            "DestCountry": "IT"
          }
        }
      ]
    }
  }
}
```

This query uses a Boolean `must` clause to combine three conditions:

1. The carrier is Logstash Airways.
2. The average ticket price is between 0 and 1000 dollars, inclusive.
3. The destination country is Italy (`IT`).
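
Because none of these three conditions needs to influence relevance scoring, the same clauses could also be placed in the `filter` clause of the `bool` query, which evaluates them in filter context (no scoring, and the results may be cached). The following is a sketch of that variant, using the same index, fields, and values as the query above:

```json
GET opensearch_dashboards_sample_data_flights/_search
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "Carrier": "Logstash Airways" } },
        { "range": { "AvgTicketPrice": { "gte": 0, "lte": 1000 } } },
        { "term": { "DestCountry": "IT" } }
      ]
    }
  }
}
```

Both forms should match the same documents. With only `filter` clauses, every hit receives a score of 0, so add an explicit sort if the order of results matters.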

By following these steps, you can filter and examine large datasets using the queries and criteria relevant to your investigation.
