Improving Search relevancy through Generic Second stage reranker #248
Comments
@navneet1v -- Let's move this to the neural-search repo
@opensearch-project/admin I think we need an OpenSearch maintainer or admin to move this to https://github.com/opensearch-project/neural-search
@aliasneo1 thanks for opening this github issue. This seems very interesting. I have a couple of questions related to this:
Waiting eagerly for this feature! Do we have an estimated release date for it? I believe this is an important use case for search.
@navneet1v @ylwu-amzn What do you guys think about adding support for this in the RAG pipeline? This is a perfect use case for RAG. We can add a search response processor (similar to the Kendra re-ranker) that makes use of cross encoders. What are some good candidates for pre-trained models to bring into ml-commons? Any suggestions?
What do you mean by this?
I think there has been some thought given to this, but if you want to start coming up with a proposal, feel free to do that. As per my understanding, we want to build this feature in Neural Search, so if you are interested, feel free to create a proposal for that.
No idea around this. This needs to be researched.
A desired solution was already stated above.
Since KNN does the first part, we need a search processor that does the second part (re-ranking). We have a reranker that uses Amazon Kendra. What I had in mind is a reranker that uses a cross encoder which executes in a search response processor. I'm hoping that this can help improve the results that the RAG processor feeds to LLMs. So, in this use case, the reranker would run after the hybrid search processor runs. ml-commons might be a better place for this since cross encoders can run on BM25 (or sparse vector) results (independent of semantic search/KNN).
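To sketch where that would sit, a search pipeline combining the existing hybrid normalization processor with a hypothetical cross-encoder reranking response processor might look roughly like this (the cross-encoder-rerank processor name and its fields are placeholders for the processor being discussed, not an existing API):

PUT /_search/pipeline/hybrid-then-rerank
{
  "phase_results_processors": [
    {
      "normalization-processor": {
        "normalization": { "technique": "min_max" },
        "combination": { "technique": "arithmetic_mean" }
      }
    }
  ],
  "response_processors": [
    {
      "cross-encoder-rerank": {
        "model_id": "id of cross-encoder (placeholder)",
        "context_field": "source field to compare to the query (placeholder)",
        "top_k": 100
      }
    }
  ]
}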
@austintlee in terms of improving search relevance, we are targeting to put features in the Neural Search plugin, or else put a basic re-ranking interface in core. ML Commons can provide the model that will do the re-ranking, but in terms of providing the interface for re-ranking, ML Commons is not the right option. It's closely related to Search.
I don't think of ml-commons as just a model serving layer, although I think we do want to use it for serving cross encoders. I mainly want to see this functionality come to life. Even if we put this in neural-search, it's still going to be a search processor, right?
Actually it is. The reason I was thinking of building the re-ranking feature outside of ML Commons is so that users can write their own re-rankers, which can use models not served by ML Commons. So here is what I was thinking:
Yes, the initial thought is around that only. But I was stepping up the feature here to see if we can do better. Plus, there is one big question we need to answer before we start working: does re-ranking using a Cross Encoder improve search relevance? If yes, by how much, and if we compare it with techniques like Normalization and Score Combination, what is the difference?
@austintlee I have added this to the VectorDB hot backlog. @vamshin let's see if we can prioritize this.
I know this is an important question, and we want to get at least a rough idea of how much improvement, if any, this will get us before we take this on. But at the same time, there clearly seems to be appetite from the community for making this feature available so people can experiment on their own data.
I'm super interested in the vector db roadmap. Can we have a public meeting where we can discuss hot topics, what's coming, etc?
@austintlee FYI this is the public roadmap: https://github.com/orgs/opensearch-project/projects/145
@navneet1v Curious, are you saying that you see re-ranking and normalization as mutually exclusive? Or maybe I misunderstood?
That's how I am approaching this. Another search response processor. Maybe there is a reranking processor that is tailored to re-ranking, but it would just be an extension of a search processor.
@navneet1v Some metrics from experiments using a reranker in python-land on a customer dataset:
These experiments were run using hybrid search (min-max norm, linear combination, weighted (0.111, 0.889) towards neural over bm25) with embedding model "thenlper/gte-small" and reranker "BAAI/bge-reranker-large", reranking the top 100 documents. There are about 1000 documents in the index and 19 (question, docs) pairs (each of the 19 questions matches about 1-3 relevant documents).
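(For anyone trying to reproduce a similar setup, that normalization/combination configuration maps to an OpenSearch hybrid search pipeline roughly like the following; the pipeline name is illustrative, and the weights are ordered to match a bm25-then-neural hybrid query.)

PUT /_search/pipeline/hybrid-minmax-pipeline
{
  "phase_results_processors": [
    {
      "normalization-processor": {
        "normalization": { "technique": "min_max" },
        "combination": {
          "technique": "arithmetic_mean",
          "parameters": { "weights": [0.111, 0.889] }
        }
      }
    }
  ]
}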
Can we target this for 2.12?
@navneet1v @vamshin Is someone working on this? If not, I'll take it.
@HenryL27 thanks for your interest. We are looking into this. We will need your support with RFC/Code reviews.
@vamshin so yes, someone is working on this? What's the timeline? Any sub-issues I can pick up? I don't see any issues yet.
@HenryL27 we started scoping this work and are using this GitHub issue as a feature request. As a first step, we will do an RFC to get community feedback on the approaches. Our idea is to build a more generic reranker (multi-stage) capable of passing metadata (user context) and OpenSearch results to a remote connector for reranking results. Let me get back on the timelines. Happy to collaborate.
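(As a rough illustration of the remote-connector direction, an ml-commons connector that forwards the query and candidate documents to an external rerank endpoint could look something like the sketch below; the endpoint URL, parameter names, and request body are placeholders rather than a finalized design.)

POST /_plugins/_ml/connectors/_create
{
  "name": "remote-reranker-connector (illustrative)",
  "description": "Forwards the query and candidate documents to an external rerank endpoint",
  "version": "1",
  "protocol": "http",
  "credential": { "api_key": "<api key>" },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://example.com/rerank",
      "headers": { "Authorization": "Bearer ${credential.api_key}" },
      "request_body": "{ \"query\": \"${parameters.query}\", \"documents\": ${parameters.documents} }"
    }
  ]
}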
@vamshin We have a customer who wants this now - can we scope this down to a simple 1-stage reranker and then expand it later? Here's the API spec I have in mind:

APIs

Create Rerank Pipeline

PUT /_search/pipeline
{
"response_processors": [
{
"neural-rerank": {
"top_k": int (how many to rerank),
"model_id": id of cross-encoder,
"context_field": str (source field to compare to query)
}
}
]
}

Query Rerank Pipeline

POST index/_search
{
"query": {...},
"ext": {
"rerank": {
"query_text": str (query text to compare)
}
}
}

or alternatively

POST index/_search
{
"query": {...}
"ext": {
"rerank": {
"query_text_path": str (path in the search body to the query text)
}
}
}

For example, with a neural query we might have

The rerank processor will evaluate the

Upload Cross Encoder Model

POST /_plugins/_ml/models/_upload
{
"name": "model name",
"version": "1.0.0 or something",
"description": "description",
"model_format": "TORCH_SCRIPT",
"function_name": "TEXT_SIMILARITY",
"model_content_hash_value": "hash browns",
"url": "https://url-of-model"
}

This is not a new API or anything, and all the other model-based APIs should still work for the cross encoder model/function name with minimal work to integrate. Basically, this is a simple search response processor that a user can plug into whatever search request they have; it looks a lot like how a neural search works, so it should be familiar. That would solve probably 90% of use cases.
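End to end, usage under this spec would look something like the following (the pipeline name, field names, and query text are illustrative placeholders):

PUT /_search/pipeline/my-rerank-pipeline
{
  "response_processors": [
    {
      "neural-rerank": {
        "top_k": 100,
        "model_id": "id of the uploaded cross-encoder",
        "context_field": "passage_text"
      }
    }
  ]
}

POST index/_search?search_pipeline=my-rerank-pipeline
{
  "query": { "match": { "passage_text": "how do I reset my password" } },
  "ext": {
    "rerank": {
      "query_text": "how do I reset my password"
    }
  }
}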
All, I'm catching up on this thread. This is on our product roadmap. We're going to implement a generic second-stage re-ranking search pipeline. You will be able to integrate a second-stage re-ranker like a cross-encoder via the AI connectors available in ml-commons. This functionality will be integrated with the neural search and LTR experience. I made a forum post a while back to collect community feedback.
@dylan-tong-aws Henry is essentially signing up to do the work described in that post.
@HenryL27 please feel free to publish the RFC and we can let the community provide feedback. Thanks
See #485
@HenryL27 @navneet1v Is it ok to change the title to "Generic Second stage reranker to Improve search relevancy"?
I don't see any concern. Let me change this.
Closing this issue as it's released in 2.12.0.
Issue Description:
We are currently utilizing a neural retriever based on the bi-encoder vector search method. However, it has come to our attention that the performance of the bi-encoder approach is suboptimal when compared to the cross-encoder method, as highlighted in the referenced research paper (link).
Desired Solution:
We propose the integration of both Cross-Encoder and Bi-Encoder methods to enhance retrieval performance, particularly in scenarios involving large datasets. Cross-Encoders demonstrate superior performance, but they encounter scalability challenges with extensive datasets. To address this, a hybrid approach can be employed in scenarios like Information Retrieval and Semantic Search. Here's the suggested process:
Initiate retrieval using an efficient Bi-Encoder to identify the top 100 most similar sentences for a given query.
Subsequently, employ a Cross-Encoder to re-rank the initial 100 matches. This involves computing scores for each (query, hit) pairing.
By incorporating a Cross-Encoder-based re-ranker after the initial retrieval, a notable enhancement in final results for users can be achieved.
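As a rough sketch of how this two-stage flow could look in OpenSearch terms, assuming a neural (Bi-Encoder) query for the first stage and a reranking search pipeline along the lines discussed in this thread for the second stage (the pipeline name, model IDs, field names, and query text are illustrative placeholders):

POST my-index/_search?search_pipeline=my-rerank-pipeline
{
  "size": 100,
  "query": {
    "neural": {
      "passage_embedding": {
        "query_text": "example user question",
        "model_id": "id of the bi-encoder model",
        "k": 100
      }
    }
  },
  "ext": {
    "rerank": {
      "query_text": "example user question"
    }
  }
}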
Considered Alternatives:
We have evaluated several alternative solutions and features in pursuit of improved retrieval performance. However, none have proven as effective as the combined Cross-Encoder and Bi-Encoder approach proposed above.
Additional Context:
For a more comprehensive understanding, any supplementary context or relevant screenshots related to this feature request will be provided as necessary. Your consideration of this enhancement would be greatly appreciated.