Add existing indices as a source data option; improve config autofilling #346

Merged (7 commits) on Sep 5, 2024

Conversation

ohltyler
Member

@ohltyler ohltyler commented Sep 4, 2024

Description

This PR adds a third option for specifying the source data: an existing index. When it is selected, a number of sample documents are automatically fetched and populated. Additionally, when data is fetched from an existing index (or from an uploaded document), we attempt to prefill some config data. More specifically:

  • if index data is automatically populated, clear and update the ML ingest processors (if applicable) with the new text/vector fields. Image fields are ignored for now: there is no clear default pattern for them in ML processors, and handling them would add to the fine-grained assumptions we already make.
  • if uploaded document data is automatically populated, clear some fields in the ML ingest processors (if applicable)
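
To make the two cases above concrete, here is a minimal sketch of how sample documents could drive the prefill. Every name in it (`buildSampleQuery`, `guessFields`, `prefillProcessor`, and the `MlProcessorFields` shape) is hypothetical and not taken from this PR; the only external detail assumed is the standard `match_all` search body with a `size` limit. Treat it as an illustration of the idea, not the plugin's implementation.

```typescript
// All names below are hypothetical; they illustrate the idea, not the plugin's code.
type SampleDoc = Record<string, unknown>;
type DataSource = "manual" | "upload" | "index";

interface SampleQuery {
  query: { match_all: Record<string, never> };
  size: number;
}

interface MlProcessorFields {
  inputField: string | undefined; // text field fed to the model
  outputField: string | undefined; // vector field receiving the embedding
}

// Search body that pulls a handful of sample documents from an index.
function buildSampleQuery(size: number): SampleQuery {
  return { query: { match_all: {} }, size };
}

// A value is treated as a vector if it is a non-empty array of finite numbers.
function isNumericVector(value: unknown): boolean {
  return (
    Array.isArray(value) &&
    value.length > 0 &&
    value.every((v) => typeof v === "number" && Number.isFinite(v))
  );
}

// Collect candidate text and vector field names across the sample documents.
function guessFields(docs: SampleDoc[]): { textFields: string[]; vectorFields: string[] } {
  const textFields: string[] = [];
  const vectorFields: string[] = [];
  for (const doc of docs) {
    for (const key of Object.keys(doc)) {
      const value = doc[key];
      if (typeof value === "string" && !textFields.includes(key)) {
        textFields.push(key);
      } else if (isNumericVector(value) && !vectorFields.includes(key)) {
        vectorFields.push(key);
      }
    }
  }
  return { textFields, vectorFields };
}

// Index source: clear, then refill from the guessed fields.
// Upload or manual source: clear only, since we make no guesses there.
function prefillProcessor(source: DataSource, docs: SampleDoc[]): MlProcessorFields {
  const cleared: MlProcessorFields = { inputField: undefined, outputField: undefined };
  if (source !== "index") {
    return cleared;
  }
  const { textFields, vectorFields } = guessFields(docs);
  return { inputField: textFields[0], outputField: vectorFields[0] };
}
```

In this sketch, image-like fields simply fall through both checks and are never picked, which mirrors the "ignored for now" note above. Taking the first candidate of each kind is an arbitrary choice made here for brevity.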

Additional improvements:

  • cleans up remaining stringify operations to use customStringify()
  • moves index fetching to the top-level WorkflowDetail page so it is optimized and only called once
  • switches the remaining form components to their compressed variants (EuiSuperSelect -> EuiCompressedSuperSelect)
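
As a rough illustration of why routing all stringify calls through one helper is useful, the sketch below uses a hypothetical `prettyStringify` function. The body of the real `customStringify()` is not shown in this PR; the assumption here is only that such a helper wraps `JSON.stringify` with a fixed indentation so every editor and preview formats JSON the same way.

```typescript
// Hypothetical stand-in for a shared stringify helper; the real
// customStringify() body is not shown in this PR.
function prettyStringify(value: object): string {
  // Fixed two-space indentation keeps every call site consistent.
  return JSON.stringify(value, undefined, 2);
}

// Call sites use the helper instead of ad hoc JSON.stringify variants.
const preview: string = prettyStringify({ index: "my-index", size: 5 });
```

With a single helper, changing the formatting later (for example the indent width) only touches one function.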

The demo video shows a workflow with some sample data, then creates another workflow that builds a semantic search use case on top of the existing workflow's data, using the new multi-option UI for data source selection. It also shows the underlying configs being updated when data is populated via manual entry or file upload. Note that this does not currently overwrite any default index mappings; users will still need to ensure the mappings are up to date before creating.

screen-capture.13.webm

Issues Resolved

[List any issues this PR will resolve]

Check List

  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@ohltyler ohltyler marked this pull request as ready for review September 5, 2024 19:01
@saimedhi saimedhi merged commit 6408c43 into opensearch-project:main Sep 5, 2024
6 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Sep 5, 2024
…ing (#346)

* Add options; auto-populate sample docs on index selection

Signed-off-by: Tyler Ohlsen <[email protected]>

* onboard getmappings api; set default mappings and ml processor configs optionally

Signed-off-by: Tyler Ohlsen <[email protected]>

* clear out some config on manual or uploaded doc input changes

Signed-off-by: Tyler Ohlsen <[email protected]>

* change default index name for custom

Signed-off-by: Tyler Ohlsen <[email protected]>

* revert overwriting of index mappings

Signed-off-by: Tyler Ohlsen <[email protected]>

* revert vector field overriding

Signed-off-by: Tyler Ohlsen <[email protected]>

* Move vector search check

Signed-off-by: Tyler Ohlsen <[email protected]>

---------

Signed-off-by: Tyler Ohlsen <[email protected]>
(cherry picked from commit 6408c43)
ohltyler added a commit that referenced this pull request Sep 5, 2024
…ing (#346) (#347)


---------

Signed-off-by: Tyler Ohlsen <[email protected]>
(cherry picked from commit 6408c43)

Co-authored-by: Tyler Ohlsen <[email protected]>