Add a prototype tab; support ingest and search with guardrails #139

ohltyler · 2024-04-19T03:30:30Z

Description

This PR adds a Prototype tab for testing a workflow end-to-end. Where this will ultimately live in the UX is TBD, so I've placed it in its own sandboxed tab for now for ease of use. Based on the workflow's use case, it dynamically parses out the fields needed to generate both 1/ a valid document to ingest into the configured index, and 2/ a valid query to run some variation of neural search against that index. For example, a SEMANTIC_SEARCH use case has the index mapping configured with a specific set of fields (input field, vector field, model ID) that define the document schema, along with a particular way to execute a search using a neural query clause over those fields. Other neural search variations will have different schemas and different constructions of valid queries.

In particular (current scope is only semantic search), UI guardrails make everything read-only except:

- the value of the plaintext field used when ingesting
- the value of the plaintext field used when querying

The rest of the construction is generated from the use case (for now only semantic search) and the values of the configured fields (input field, vector field, model ID, index name).
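
As a rough illustration of the generation described above (a sketch, not the code in this PR), the ingest document and the neural query could be derived from the configured fields roughly as follows. The `SemanticSearchConfig` interface, the function names, and the default `k` are assumptions made up for this example; the `neural` clause shape with `query_text`, `model_id`, and `k` follows the OpenSearch neural query DSL.

```typescript
// Hypothetical shape for the values configured in the workflow.
interface SemanticSearchConfig {
  indexName: string;
  inputField: string;
  vectorField: string;
  modelId: string;
}

// Builds the document to ingest. Only the plaintext value is user-supplied;
// the vector field is assumed to be filled in by the ingest pipeline.
function buildIngestDoc(
  config: SemanticSearchConfig,
  text: string
): Record<string, string> {
  return { [config.inputField]: text };
}

// Builds a neural query against the configured vector field.
// Only queryText is user-supplied; k defaults to an arbitrary value here.
function buildNeuralQuery(
  config: SemanticSearchConfig,
  queryText: string,
  k: number = 10
) {
  return {
    query: {
      neural: {
        [config.vectorField]: {
          query_text: queryText,
          model_id: config.modelId,
          k,
        },
      },
    },
  };
}
```

With this shape, the read-only parts of the request come entirely from the config, and the two editable plaintext values are the only function arguments a user controls.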

Note this is just a first pass; we can easily expose more or less depending on how much flexibility is desired (e.g., freeform queries, editing of k or other neural clause params, etc.)

Implementation details:

- added Ingestor component to format, execute, and display the response from running an indexing command against an index
- added QueryExecutor component to format, execute, and display a formatted list of results from running a query against an index
- added base Prototype component to handle global prototype state and tab state
- added data_extractor_utils to organize all utility functions used for parsing the workflow data to get different fields based on the workflow use case. This will be expanded upon and refactored as more use cases are added
- onboarded search index and index document APIs
- minor handling of empty state in the Prototype tab when the workflow has not been provisioned and/or no resources are available
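
To give a sense of the "formatted list of results" step, here is a minimal sketch of turning a search response into display rows. It is not the QueryExecutor implementation; `formatHits` and `ResultRow` are hypothetical names, and the response interfaces cover only the `hits.hits[]` subset of the OpenSearch search response that the example needs.

```typescript
// Minimal, assumed subset of an OpenSearch search response.
interface SearchHit {
  _id: string;
  _score: number | null;
  _source: Record<string, unknown>;
}

interface SearchResponse {
  hits: { hits: SearchHit[] };
}

// Hypothetical row shape for rendering in a results list.
interface ResultRow {
  id: string;
  score: number | null;
  text: string;
}

// Maps each hit to a row, showing only the configured plaintext field.
// A hit that lacks the field yields an empty string instead of throwing.
function formatHits(response: SearchResponse, inputField: string): ResultRow[] {
  return response.hits.hits.map((hit) => ({
    id: hit._id,
    score: hit._score,
    text: String(hit._source[inputField] ?? ''),
  }));
}
```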

Demo video:

- creating a semantic search workflow and defining all of the configured fields
- provisioning to set up the index and ingest pipeline
- using Ingest tab for ingesting some sample documents, only requiring plaintext input
- using Query tab for searching against those documents, only requiring plaintext input
- error handling on empty search
- basic validation of semantic search working as expected

screen-capture.31.webm

Issues Resolved

Makes progress on #68

Check List

Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Tyler Ohlsen <[email protected]>

amitgalitz · 2024-04-19T16:19:44Z

Looks awesome, quick question, if provisioning takes a long time, should we have some refresh capability or it will auto query until provisioning is done?

ohltyler · 2024-04-19T16:28:22Z

> Looks awesome, quick question, if provisioning takes a long time, should we have some refresh capability or it will auto query until provisioning is done?

Yep, currently it's very basic, only does one refresh after 5s, no auto refresh. My idea to improve will be logic in the state rendering, and if it's in a transitive state (provisioning/deprovisioning), perform auto-refresh on some 1 or 2s interval or something. Not that fine-grained yet though
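
A rough sketch of what that auto-refresh idea could look like, assuming a state fetcher is available. Nothing here is implemented in this PR: the state names, `pollUntilSettled`, the 2s default interval, and the attempt cap are all hypothetical. The sleep function is passed in so the loop stays easy to test.

```typescript
// Hypothetical state names; the plugin's actual states may differ.
type WorkflowState =
  | 'NOT_STARTED'
  | 'PROVISIONING'
  | 'DEPROVISIONING'
  | 'COMPLETED'
  | 'FAILED';

// States in which the workflow is still changing and worth re-polling.
function isTransitive(state: WorkflowState): boolean {
  return state === 'PROVISIONING' || state === 'DEPROVISIONING';
}

// Re-fetches the state on a fixed interval while it is transitive,
// giving up after maxAttempts fetches so a stuck job cannot poll forever.
async function pollUntilSettled(
  fetchState: () => Promise<WorkflowState>,
  sleep: (ms: number) => Promise<void>,
  intervalMs: number = 2000,
  maxAttempts: number = 30
): Promise<WorkflowState> {
  let state = await fetchState();
  for (let attempt = 1; attempt < maxAttempts && isTransitive(state); attempt++) {
    await sleep(intervalMs);
    state = await fetchState();
  }
  return state;
}
```

In a UI, `sleep` could be a thin wrapper around a timer, and the returned state would drive the rendering; a manual refresh button could call the same fetcher once.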

amitgalitz · 2024-04-19T16:32:46Z

> > Looks awesome, quick question, if provisioning takes a long time, should we have some refresh capability or it will auto query until provisioning is done?
>
> Yep, currently it's very basic, only does one refresh after 5s, no auto refresh. My idea to improve will be logic in the state rendering, and if it's in a transitive state (provisioning/deprovisioning), perform auto-refresh on some 1 or 2s interval or something. Not that fine-grained yet though

Sounds good, even a refresh button later on if some jobs take a while like reindex or deploying a large pre trained model, we have refresh buttons all over core OSD too

ohltyler · 2024-04-19T16:35:10Z

> > > Looks awesome, quick question, if provisioning takes a long time, should we have some refresh capability or it will auto query until provisioning is done?
> >
> > Yep, currently it's very basic, only does one refresh after 5s, no auto refresh. My idea to improve will be logic in the state rendering, and if it's in a transitive state (provisioning/deprovisioning), perform auto-refresh on some 1 or 2s interval or something. Not that fine-grained yet though
>
> Sounds good, even a refresh button later on if some jobs take a while like reindex or deploying a large pre trained model, we have refresh buttons all over core OSD too

Agreed. I was thinking similar. Same with on the base workflow list page, can be helpful. I've already seen on my local single-node cluster how the pretrained models can take up a good 10-15s to finish getting deployed locally.

Signed-off-by: Tyler Ohlsen <[email protected]> (cherry picked from commit e18bb11)

…#140) Signed-off-by: Tyler Ohlsen <[email protected]> (cherry picked from commit e18bb11) Co-authored-by: Tyler Ohlsen <[email protected]>

owaiskazi19 · 2024-04-19T16:54:58Z

@ohltyler are we not adding tests for now?

ohltyler · 2024-04-19T17:01:41Z

> @ohltyler are we not adding tests for now?

No. Not until there's even an idea of a final proposed design, it's not worth adding as the code is too fluid

ohltyler added 7 commits April 18, 2024 18:11

Add prototype page skeleton; add semantic search query generation

bdbb9b8

Signed-off-by: Tyler Ohlsen <[email protected]>

Fetch configured index

9669a0a

Signed-off-by: Tyler Ohlsen <[email protected]>

add formatting on index side

b37be0d

Signed-off-by: Tyler Ohlsen <[email protected]>

Onboard search API; onboard query execution on UI

37e45f1

Signed-off-by: Tyler Ohlsen <[email protected]>

Add tabs; support same functionality for ingest;

41e3d61

Signed-off-by: Tyler Ohlsen <[email protected]>

minor format change

ecb2d5e

Signed-off-by: Tyler Ohlsen <[email protected]>

minor formatting

8571000

Signed-off-by: Tyler Ohlsen <[email protected]>

ohltyler requested review from dbwiddis, owaiskazi19, joshpalis, amitgalitz and jackiehanyang as code owners April 19, 2024 03:30

opensearch-trigger-bot bot added the backport 2.x label Apr 19, 2024

ohltyler added rapid workflow prototyping labels Apr 19, 2024

amitgalitz approved these changes Apr 19, 2024

ohltyler merged commit e18bb11 into opensearch-project:main Apr 19, 2024
10 checks passed

ohltyler deleted the sparse-search branch April 19, 2024 16:35

opensearch-trigger-bot bot pushed a commit that referenced this pull request Apr 19, 2024

Add a prototype tab; support ingest and search with guardrails (#139)

371e0dc

Signed-off-by: Tyler Ohlsen <[email protected]> (cherry picked from commit e18bb11)

opensearch-trigger-bot bot mentioned this pull request Apr 19, 2024

[Backport 2.x] Add a prototype tab; support ingest and search with guardrails #140

Merged
