Reciprocal Rank Fusion (RRF) normalization technique in hybrid query #874

Open
wants to merge 5 commits into
base: feature/rrf-score-normalization-v2

Conversation

Johnsonisaacn

@Johnsonisaacn Johnsonisaacn commented Aug 28, 2024

Description

Adds the ability to process and combine scores from multiple subqueries in neural search using the reciprocal rank fusion (RRF) technique. It is implemented as a new processor and processor factory class, separate from NormalizationProcessor. The API changes are described in the RFC. Weights are not currently supported when combining the processed subquery scores, because the existing literature offers no examples of weighted RRF.
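
For readers unfamiliar with RRF, here is a minimal sketch of the unweighted fusion described above. It is not code from this PR: the class name, method name, and input shape (one ranked list of document IDs per subquery) are assumptions made for illustration. Each document receives 1 / (rankConstant + rank) from every subquery that returned it, with rank starting at 1, and the contributions are summed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: RrfSketch and fuse are hypothetical names, not part of this PR.
final class RrfSketch {

    /**
     * Fuses ranked lists of document IDs, one list per subquery, best match first.
     * Each document gets 1 / (rankConstant + rank) from every list it appears in,
     * where rank is 1-based. Contributions are summed without weights.
     */
    static Map<Integer, Double> fuse(List<List<Integer>> rankedDocIdsPerSubQuery, int rankConstant) {
        Map<Integer, Double> fusedScores = new HashMap<>();
        for (List<Integer> rankedDocIds : rankedDocIdsPerSubQuery) {
            for (int position = 0; position < rankedDocIds.size(); position++) {
                int rank = position + 1; // ranks start at 1, array positions at 0
                double contribution = 1.0 / (rankConstant + rank);
                // Add to the running total for this document, or start a new one.
                fusedScores.merge(rankedDocIds.get(position), contribution, Double::sum);
            }
        }
        return fusedScores;
    }
}
```

Calling `RrfSketch.fuse(List.of(List.of(1, 2), List.of(2, 3)), 60)` gives document 2 the highest fused score, since it is the only document returned by both subqueries. A rank constant of 60 is the value commonly used in the RRF literature; whether this PR uses it as a default is not stated here.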

Related Issues

Resolves #[Issue number to be closed when this PR is merged]
#865
#659

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Isaac Johnson added 4 commits August 16, 2024 12:33
Signed-off-by: Isaac Johnson <[email protected]>
Signed-off-by: Isaac Johnson <[email protected]>
Signed-off-by: Isaac Johnson <[email protected]>
@Johnsonisaacn Johnsonisaacn changed the title Rrf Implementing Reciprocal Rank Fusion (RRF) in Neural Search Aug 28, 2024
@Johnsonisaacn Johnsonisaacn marked this pull request as ready for review August 28, 2024 20:47
@vibrantvarun vibrantvarun changed the title Implementing Reciprocal Rank Fusion (RRF) in Neural Search Implementing Reciprocal Rank Fusion (RRF) Aug 28, 2024
@vibrantvarun vibrantvarun changed the title Implementing Reciprocal Rank Fusion (RRF) Reciprocal Rank Fusion (RRF) normalization technique in hybrid query Aug 28, 2024
Signed-off-by: Isaac Johnson <[email protected]>
@martin-gaievski
Member

We should be merging into the feature branch https://github.com/opensearch-project/neural-search/tree/feature/rrf-score-normalization, not main.

List<TopDocs> topDocsPerSubQuery = compoundQueryTopDocs.getTopDocs();
int numSubQueriesBound = topDocsPerSubQuery.size();
for (int index = 0; index < numSubQueriesBound; index++) {
int numDocsPerSubQueryBound = topDocsPerSubQuery.get(index).scoreDocs.length;
Member

We can hold the score docs array in a local variable; that should save a few CPU cycles.

ScoreDoc[] scoreDocs = topDocsPerSubQuery.get(index).scoreDocs;
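
In the context of the loop under review, the suggestion might look roughly like the sketch below. The `SketchScoreDoc` and `SketchTopDocs` classes are stand-ins for Lucene's `ScoreDoc` and `TopDocs` so that the example compiles on its own, and the loop body (overwriting each score with a rank-based value) is an assumption about what the processor does, not a copy of the PR.

```java
import java.util.List;

// Stand-in for Lucene's ScoreDoc (public doc and score fields); hypothetical name.
final class SketchScoreDoc {
    final int doc;
    float score;

    SketchScoreDoc(int doc, float score) {
        this.doc = doc;
        this.score = score;
    }
}

// Stand-in for Lucene's TopDocs (public scoreDocs field); hypothetical name.
final class SketchTopDocs {
    final SketchScoreDoc[] scoreDocs;

    SketchTopDocs(SketchScoreDoc[] scoreDocs) {
        this.scoreDocs = scoreDocs;
    }
}

final class RankScoreLoopSketch {

    /** Replaces each score with 1 / (rankConstant + rank), assuming scoreDocs is sorted best first. */
    static void applyRankScores(List<SketchTopDocs> topDocsPerSubQuery, int rankConstant) {
        for (int index = 0; index < topDocsPerSubQuery.size(); index++) {
            // Look the array up once per subquery instead of on every inner iteration.
            SketchScoreDoc[] scoreDocs = topDocsPerSubQuery.get(index).scoreDocs;
            for (int position = 0; position < scoreDocs.length; position++) {
                scoreDocs[position].score = 1.0f / (rankConstant + position + 1);
            }
        }
    }
}
```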

@NonNull
private ScoreCombinationTechnique combinationTechnique;
@Nullable
private int rankConstant;
Member

The rank constant should be a parameter of the RRF normalization technique class, not a top-level field in this DTO. You can pass it to the RRF normalization technique's constructor when the factory class creates the instance.
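
One possible shape for that refactoring is sketched below. All names here (`RrfNormalizationTechniqueSketch`, `NormalizationFactorySketch`, the `rank_constant` key, the default of 60, and the lower bound of 1) are hypothetical and chosen for illustration; the PR's actual classes and parameter handling may differ.

```java
import java.util.Map;

// Hypothetical technique class: it owns the rank constant, so the DTO no longer has to.
final class RrfNormalizationTechniqueSketch {
    static final int DEFAULT_RANK_CONSTANT = 60; // assumed default, not confirmed by the PR

    private final int rankConstant;

    RrfNormalizationTechniqueSketch(int rankConstant) {
        // Assumed validation: a rank constant below 1 is rejected.
        if (rankConstant < 1) {
            throw new IllegalArgumentException("rank constant must be at least 1");
        }
        this.rankConstant = rankConstant;
    }

    int getRankConstant() {
        return rankConstant;
    }

    /** Score contribution for a 1-based rank. */
    float scoreForRank(int rank) {
        return 1.0f / (rankConstant + rank);
    }
}

// Hypothetical factory: reads the parameter and hands it to the constructor.
final class NormalizationFactorySketch {
    static RrfNormalizationTechniqueSketch create(Map<String, Object> params) {
        Object value = params.getOrDefault(
            "rank_constant", RrfNormalizationTechniqueSketch.DEFAULT_RANK_CONSTANT);
        if (!(value instanceof Integer)) {
            throw new IllegalArgumentException("rank_constant must be an integer");
        }
        return new RrfNormalizationTechniqueSketch((Integer) value);
    }
}
```

With this arrangement the DTO only needs to carry the technique instance, and `rankConstant` no longer has to be declared as a `@Nullable` primitive field.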

@Johnsonisaacn Johnsonisaacn changed the base branch from main to feature/rrf-score-normalization September 4, 2024 22:24
@Johnsonisaacn Johnsonisaacn changed the base branch from feature/rrf-score-normalization to feature/rrf-score-normalization-v2 September 4, 2024 23:40