BWC tests neural search #515

Merged on Jan 2, 2024 (47 commits)

@vibrantvarun vibrantvarun commented Dec 7, 2023

Description

Neural search transforms text into vectors and facilitates vector search both at ingestion time and at search time. During ingestion, neural search transforms document text into vector embeddings and indexes both the text and its vector embeddings in a vector index. When you use a neural query during search, neural search converts the query text into vector embeddings, uses vector search to compare the query and document embeddings, and returns the closest results.
Before you ingest documents into an index, documents are passed through a machine learning (ML) model, which generates vector embeddings for the document fields. When you send a search request, the query text or image is also passed through the ML model, which generates the corresponding vector embeddings. Then neural search performs a vector search on the embeddings and returns matching documents.
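
The ingest-then-query flow described above can be illustrated with a toy sketch. This is not the plugin's code: `embed` is a hypothetical stand-in for the ML model (it simply hashes words into buckets), and `ToyVectorIndex` is an invented in-memory index used only to show that documents and queries pass through the same embedding step before a vector comparison.

```python
import math
from collections import Counter


def embed(text: str, dims: int = 16) -> list[float]:
    """Hypothetical stand-in for an ML model: hash words into a unit vector."""
    vec = [0.0] * dims
    for word, count in Counter(text.lower().split()).items():
        # Deterministic bucket per word (sum of code points); not a real embedding.
        vec[sum(map(ord, word)) % dims] += count
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a: list[float], b: list[float]) -> float:
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


class ToyVectorIndex:
    """Invented in-memory index that stores each text next to its embedding."""

    def __init__(self) -> None:
        self.docs: list[tuple[str, list[float]]] = []

    def ingest(self, text: str) -> None:
        # Ingestion time: embed the document and index text plus vector.
        self.docs.append((text, embed(text)))

    def neural_query(self, query_text: str, k: int = 1) -> list[str]:
        # Search time: embed the query with the same model, then rank by similarity.
        q = embed(query_text)
        ranked = sorted(self.docs, key=lambda d: cosine(q, d[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


index = ToyVectorIndex()
index.ingest("hello world")
index.ingest("backward compatibility tests")
top = index.neural_query("hello world", k=1)
```

A real deployment would replace `embed` with a deployed model and the linear scan with an approximate nearest-neighbor index; the sketch only mirrors the order of the steps.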

BWC stands for Backward Compatibility.

Many features have been added to neural search since its launch, so the plugin needs BWC tests to ensure it stays compatible with older versions.

This PR contains:

  • Setting up the architecture for BWC tests
  • Optimizing the location of common files shared by the BWC tests and the integration test framework
  • Semantic search
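
As a rough illustration of what a BWC test checks, here is a minimal sketch of the usual two-phase pattern: write data on the old version, upgrade, then verify that the data is still readable and that new writes work. Everything in it is hypothetical (the `ClusterPhase` names, the `FakeCluster` class, the index name, and the version strings); it does not reproduce the test classes added in this PR.

```python
from enum import Enum


class ClusterPhase(Enum):
    # Hypothetical phase names for illustration only.
    OLD = "old_cluster"
    UPGRADED = "upgraded_cluster"


class FakeCluster:
    """In-memory stand-in for a cluster whose data survives an upgrade."""

    def __init__(self, version: str) -> None:
        self.version = version
        self.indices: dict[str, list[str]] = {}

    def upgrade(self, new_version: str) -> None:
        # Only the version changes; indexed data is kept.
        self.version = new_version


def run_bwc_phase(cluster: FakeCluster, phase: ClusterPhase,
                  index_name: str = "bwc-semantic-index") -> int:
    if phase is ClusterPhase.OLD:
        # Old version: create the index and ingest a document.
        cluster.indices[index_name] = ["doc written by old version"]
        return len(cluster.indices[index_name])
    # Upgraded version: old data must still be readable, and new writes must work.
    docs = cluster.indices[index_name]
    assert docs, "documents indexed before the upgrade must still be readable"
    docs.append("doc written by new version")
    return len(docs)


cluster = FakeCluster("2.11.0")   # version strings are made up
old_count = run_bwc_phase(cluster, ClusterPhase.OLD)
cluster.upgrade("3.0.0")
new_count = run_bwc_phase(cluster, ClusterPhase.UPGRADED)
```

The same test body runs in both phases and branches on the phase, which is what lets one test class cover the state before and after the upgrade.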

In addition, this PR formats the entire project by applying ./gradlew spotlessApply to all files and removing extra empty lines. The formatting rules do the following:

  • New files cannot be added without a license header
  • Wildcard imports are not allowed
  • Existing trailing whitespace and extra blank lines are removed from the project
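
To make the three rules concrete, the sketch below checks a Java source string for the same kinds of problems. It is only an illustration of what the rules look for, not how Spotless works internally; the `LICENSE_MARKER` value and the "first ten lines" window are assumptions made for the example.

```python
import re

# Assumed marker for a license header; the real header text may differ.
LICENSE_MARKER = "SPDX-License-Identifier"
WILDCARD_IMPORT = re.compile(
    r"^\s*import\s+(?:static\s+)?[\w.]+\.\*\s*;", re.MULTILINE
)


def check_java_source(source: str) -> list[str]:
    """Return a list of rule violations found in a Java source string."""
    problems = []
    # Assumption: the license header sits within the first ten lines.
    head = "\n".join(source.splitlines()[:10])
    if LICENSE_MARKER not in head:
        problems.append("missing license header")
    if WILDCARD_IMPORT.search(source):
        problems.append("wildcard import")
    if re.search(r"[ \t]+$", source, re.MULTILINE):
        problems.append("trailing whitespace")
    if re.search(r"\n{3,}", source):
        problems.append("multiple consecutive blank lines")
    return problems


good = (
    "/*\n * SPDX-License-Identifier: Apache-2.0\n */\n"
    "package demo;\n\nimport java.util.List;\n\nclass A {}\n"
)
bad = "package demo;\n\nimport java.util.*;\n\n\nclass A {} \n"

good_problems = check_java_source(good)
bad_problems = check_java_source(bad)
```

Here `good_problems` is empty, while `bad_problems` reports all four issues, which is the kind of failure a formatting check would raise before a file can be merged.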

Issues Resolved

202

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Varun Jain <[email protected]>

codecov bot commented Dec 7, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (585fbbe) 84.33% compared to head (faa8dec) 84.33%.

Additional details and impacted files
@@            Coverage Diff            @@
##               main     #515   +/-   ##
=========================================
  Coverage     84.33%   84.33%           
  Complexity      533      533           
=========================================
  Files            40       40           
  Lines          1564     1564           
  Branches        244      244           
=========================================
  Hits           1319     1319           
  Misses          133      133           
  Partials        112      112           
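
The percentage in the table can be reproduced from its own rows: lines are the sum of hits, misses, and partials, and coverage is hits divided by lines. The sketch below assumes the figure is truncated to two decimals rather than rounded, since plain rounding of 1319/1564 would give 84.34.

```python
import math

hits, misses, partials = 1319, 133, 112
lines = hits + misses + partials  # matches the "Lines" row

# Assumption: the percentage is truncated (rounded down) to two decimals.
coverage_pct = math.floor(hits / lines * 10000) / 100
```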


@vibrantvarun vibrantvarun changed the title BWC test neural search BWC tests neural search Dec 26, 2023
@vibrantvarun vibrantvarun marked this pull request as ready for review December 27, 2023 03:26

@navneet1v navneet1v left a comment

A few minor comments. Overall, the code looks good.

Great effort in adding the spotlessCheck; this was long pending. The code is in awesome shape now.

@vibrantvarun vibrantvarun reopened this Jan 2, 2024

@jmazanec15 jmazanec15 left a comment

LGTM thanks!

@navneet1v navneet1v merged commit ff38622 into opensearch-project:main Jan 2, 2024
50 checks passed
@vibrantvarun vibrantvarun added the "backport 2.x" label (adds an auto workflow to backport the PR to the 2.x branch) Jan 2, 2024
@vibrantvarun vibrantvarun self-assigned this Jan 2, 2024
@opensearch-trigger-bot

The backport to 2.x failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.x 2.x
# Navigate to the new working tree
cd .worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-515-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ff3862250ccdae41798fe0787d4872a9b07ffe2d
# Push it to GitHub
git push --set-upstream origin backport/backport-515-to-2.x
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.x

Then, create a pull request where the base branch is 2.x and the compare/head branch is backport/backport-515-to-2.x.

vibrantvarun added a commit to vibrantvarun/neural-search that referenced this pull request Jan 2, 2024
…ch-project#515)

* Reformatting test package
* Initial commit of BWC Test
* Text Search
* Fixing bug
* Text Search bug fix
* Adding windows platform in bwc
* Adding windows platform in bwc
* Rolling Upgrade tests
* Bug Fix in rolling upgrade
* Bug Fix Rolling Upgrade
* Fixing Flaky tests
* Updating BWC version to latest
* Fixing bwc test
* Semantic Search
* Bug Fix
* Debugging
* Bug Fix
* Increase memory in nodes
* Removing extra logging
* Cleaning up
* Updating Pipeline Configuration
* Remove KNN delete models
* Remove unnecessary KNN code
* Addressing comments of naveen
* Addressing comments of naveen
* Addressing comments of Naveen
* Addressing martin comments
* Addressing comments of martin
* Apply formatting.xml in all lines
* Removing extra spaces from formatting.gradle
* Addressing martin comment
* Addressing Jack comments
* Addressing Jack comments
* Addressing Jack's comments
* Fixing Test cases
* Addressing comments of Navneet
* Addressing comments of Navneet
* Addressing comments of Navneet
* Addressing comments of Navneet
* Removing extra parameter from createPipelineProcessor
* Fixing bug
* Increasing number of shards
* Bug fix of load model id
* Changing names of tests to TextEmbeddingProcessor
* Updating indexes and replicas

---------

Signed-off-by: Varun Jain <[email protected]>
martin-gaievski pushed a commit that referenced this pull request Jan 3, 2024
* Initial commit for adding BWC tests in neural search plugin (#515)

Signed-off-by: Varun Jain <[email protected]>

vibrantvarun commented Jan 3, 2024

This PR also adds the feature request from #530
