Retrieve image text pairs #14
base: main
Conversation
…dates in jsonl format
Thank you for your contributions!
- For the InteractiveRetriever, it seems that MSCOCO-specific values are hardcoded. I think we should make it a tool applicable to all tasks and datasets.
- We can add a run_interactive_retriever_pipeline.sh that demonstrates the entire pipeline.

Please see the detailed comments for the review :)
src/common/interactive_retriever.py
Outdated
```python
# MSCOCO's dataset id is hardcoded since the dataset id and query/candidate
# modalities determine the instruction part of the prompt.
# MSCOCO's dataset supports prompt instructions for both image->text and
# text->image query->candidate modalities.
self.dataset_id = 9
```
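One way to address the hardcoding, sketched below with hypothetical signatures (the actual MBEIR code may structure this differently), is to make the ids constructor arguments so the MSCOCO values become mere defaults:

```python
class InteractiveRetriever:
    """Sketch: dataset/task ids are passed in rather than hardcoded.

    The dataset id and the query/candidate modalities together determine
    the instruction part of the prompt, so the caller chooses the dataset.
    """

    MSCOCO_DATASET_ID = 9  # the value hardcoded in the original snippet

    def __init__(self, dataset_id=MSCOCO_DATASET_ID, task_id=None):
        self.dataset_id = dataset_id
        self.task_id = task_id


# The default preserves the current MSCOCO behaviour; other datasets
# simply pass their own id instead.
retriever = InteractiveRetriever()
```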
Is the InteractiveRetriever specifically designed for the MSCOCO dataset? I observed that the `self.dataset_id` and `task_id` assignments appear to be hardcoded.
I changed the InteractiveRetriever to be generic, but the way it is currently integrated with the mbeir_retriever is for retrieving complement candidates to create image-text pairs, and MSCOCO is a dataset that supports both text->image and image->text queries. The mbeir_retriever now sets the dataset to MSCOCO for this task.
```python
IMAGE = "image"


class InteractiveRetriever:
```
I noticed that the InteractiveRetriever requires a pre-built candidate index file to function correctly. To assist users with this setup, could we consider adding a script, such as `run_interactive_retriever_pipeline.sh`, that demonstrates the entire pipeline? This script would cover embedding, indexing, loading the index for the interactive retriever, and retrieving demo queries. Additionally, a step-by-step guide in the README could greatly enhance the user experience.
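The three stages the pipeline script would cover can be sketched end to end. This is a numpy-only stand-in (random vectors instead of a real encoder, a saved array instead of the project's FAISS index; file names are hypothetical), just to show the flow of embed, index, then load-and-retrieve:

```python
import tempfile
from pathlib import Path

import numpy as np

# --- 1. Embed: stand-in encoder producing unit-norm candidate vectors. ---
rng = np.random.default_rng(0)
cand_emb = rng.normal(size=(100, 8)).astype(np.float32)
cand_emb /= np.linalg.norm(cand_emb, axis=1, keepdims=True)

# --- 2. Index: the real pipeline writes a FAISS index; we save an array. ---
index_path = Path(tempfile.mkdtemp()) / "cand_index.npy"
np.save(index_path, cand_emb)

# --- 3. Load + retrieve: brute-force inner-product search for a demo query. --
index = np.load(index_path)
query = index[0]                  # demo query: reuse candidate 0's embedding
scores = index @ query            # cosine similarity (vectors are unit norm)
top_k = np.argsort(-scores)[:5]   # ids of the 5 most similar candidates
```

The demo query is a candidate embedding itself, so the top hit is that candidate with similarity 1.0, which makes the retrieval step easy to sanity-check.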
Done, I created a `unirag` folder next to `inbatch` for BLIP_FF Large and CLIP_SF Large. It has embed, index, and retrieval configs and the run script, as you requested.
@lim142857 I applied the requested changes, PTAL
[3/3] Retrieve complement candidates when enabled in the config. The complement candidate of an original retrieved candidate has the complement modality, so that a candidate and its complement always form an image-text pair. The complement candidate for each original candidate is retrieved by using the original candidate as an interactive query; it is the most relevant candidate with the complement modality that is different from the original query.
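The selection rule described in [3/3] can be sketched with a toy candidate pool (hypothetical data layout; the real retriever ranks over index search results):

```python
# Toy candidate pool: (candidate_id, modality). scores[i] is the relevance
# of candidates[i] to the interactive query (the original retrieved candidate).
candidates = [(0, "image"), (1, "text"), (2, "text"), (3, "image")]
scores = [0.9, 0.8, 0.7, 0.6]


def complement(candidate_modality, original_query_id):
    """Top-scoring candidate of the complement modality, excluding the
    original query, so candidate + complement form an image-text pair."""
    want = "text" if candidate_modality == "image" else "image"
    pool = [(score, cid) for (cid, modality), score in zip(candidates, scores)
            if modality == want and cid != original_query_id]
    return max(pool)[1] if pool else None


# For an image candidate whose original query was text candidate 1, the
# top text candidate (1) is skipped and the next one (2) is returned.
best = complement("image", original_query_id=1)
```

The exclusion clause is what implements "different from the original query": without it, the complement of an image candidate retrieved for a text query could simply be that same text query again.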