Onboard basic sentiment analysis with defaults #350
Merged
Description
Adds a new sentiment analysis preset, with defaults for ingest/search pipelines and index configurations. Based roughly off of this documented example: https://opensearch.org/docs/latest/search-plugins/search-pipelines/ml-inference-search-request/#example-externally-hosted-model
This use case is intended to be used with a specialized sentiment analysis model (or an LLM with a tuned prompt) that takes in text and returns a sentiment/category (generally Positive/Neutral/Negative). One basic example is storing and analyzing website reviews. This particular preset is two-fold:
- On ingest, store a `label` field with the returned sentiment as part of the document.
- On search, replace the `label` field's value in the request, such that only results with the matching sentiment are returned.

Overall, this use case could be tuned and enhanced in many different ways. Users may want to persist more than just a label. For example, one reasonable use case is performing a hybrid search over some text's vector, its sentiment/label, and its plaintext, and trying out different weights in a hybrid query.
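The two-fold behavior described above can be sketched locally in Python. This is only an illustration under stated assumptions, not the preset's implementation: `classify` is a hypothetical keyword-based stand-in for the sentiment model, the `review` field name is made up, and `search` is a toy in-memory filter rather than an OpenSearch query.

```python
from copy import deepcopy


def classify(text):
    """Hypothetical stand-in for the sentiment model (keyword matching only)."""
    lowered = text.lower()
    if any(word in lowered for word in ("great", "love", "excellent")):
        return "POSITIVE"
    if any(word in lowered for word in ("bad", "terrible", "hate")):
        return "NEGATIVE"
    return "NEUTRAL"


def enrich_document(doc, text_field="review"):
    """Ingest side: return a copy of the document with a `label` field added."""
    enriched = deepcopy(doc)
    enriched["label"] = classify(doc[text_field])
    return enriched


def rewrite_request(request):
    """Search side: swap the free text under `label` for its predicted sentiment."""
    rewritten = deepcopy(request)
    term = rewritten["query"]["term"]["label"]
    term["value"] = classify(term["value"])
    return rewritten


def search(docs, request):
    """Toy term match on `label`, standing in for the real index lookup."""
    wanted = request["query"]["term"]["label"]["value"]
    return [doc for doc in docs if doc["label"] == wanted]


reviews = ("I love this site", "Terrible checkout flow", "It is a website")
docs = [enrich_document({"review": text}) for text in reviews]

request = {"query": {"term": {"label": {"value": "this was a great experience"}}}}
hits = search(docs, rewrite_request(request))
print(hits)
```

Both sides call the same classifier, which is why a query phrased as free text ends up matching only documents whose stored `label` carries the same sentiment.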
More details:
- [...] (`query.term.${text_field}.value`) for the ML models. This may be tuned later on and depends on the default queries or on whether the query editing experience changes.

Demo video, showing a basic use case with a SageMaker sentiment analysis model. Also shows the default values set in the ML search request processor for a vector search use case. Note that by using all defaults, no further input is needed on this search request processor.
screen-capture.14.webm
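To make the default path `query.term.${text_field}.value` from "More details" more concrete, here is a minimal Python sketch of how such a path could be resolved against a search request body. It assumes the `${text_field}` placeholder is filled in with the configured field name and that the result is a plain dot-separated path; the helper names `resolve_path`, `get_at`, and `set_at` are hypothetical and are not part of the plugin.

```python
from string import Template


def resolve_path(path_template, text_field):
    """Fill in ${text_field} and split the dotted path into keys."""
    return Template(path_template).substitute(text_field=text_field).split(".")


def get_at(body, keys):
    """Read the value stored at the nested key path."""
    node = body
    for key in keys:
        node = node[key]
    return node


def set_at(body, keys, value):
    """Overwrite the value stored at the nested key path (mutates body)."""
    node = body
    for key in keys[:-1]:
        node = node[key]
    node[keys[-1]] = value


keys = resolve_path("query.term.${text_field}.value", "label")
request = {"query": {"term": {"label": {"value": "some review text"}}}}

model_input = get_at(request, keys)   # the text that would be sent to the model
set_at(request, keys, "POSITIVE")     # the model output written back in place
print(keys, model_input, request)
```

Splitting on `.` is a simplification: a field name that itself contains dots would need different handling.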
Check List
--signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.