UniIR

This repo is under construction. Please stay tuned.

🌐 Homepage | 🤗 Dataset | 📖 arXiv | GitHub

This repo contains the codebase for the paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers"

🔔News

  • [2024-01-21]: Refactored the codebase and released the preprocessing scripts for all datasets.
  • 🔥[2023-12-21]: Our M-BEIR Benchmark is now available for use.

Introduction

We propose the UniIR (Universal Multimodal Information Retrieval) framework to learn a single retriever that can accomplish (possibly) any retrieval task. Unlike traditional IR systems, UniIR follows instructions to take a heterogeneous query and retrieve from a heterogeneous candidate pool of millions of candidates in diverse modalities.

UniIR Teaser

Content

  1. M-BEIR
  2. Training
  3. Evaluation
  4. Model Zoo
  5. Citation and Contact

M-BEIR

To train and evaluate universal multimodal retrieval models, we build a large-scale retrieval benchmark named M-BEIR (Multimodal BEnchmark for Instructed Retrieval).

M-BEIR Downloading

We provide the M-BEIR dataset on the 🤗 Dataset page linked above. Please follow the instructions there to download the dataset and prepare the data for training and evaluation.
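
If you prefer the command line, the dataset can typically be fetched as a Git LFS repository. The snippet below is only a sketch: it assumes the dataset is hosted under the TIGER-Lab organization on Hugging Face, so double-check the repository id on the 🤗 Dataset page before cloning.

# Sketch: clone the M-BEIR dataset with Git LFS (repository id assumed; verify it on the 🤗 Dataset page)
git lfs install
git clone https://huggingface.co/datasets/TIGER-Lab/M-BEIR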

UniIR Models

We provide the codebase for training and evaluating the UniIR CLIP-ScoreFusion, CLIP-FeatureFusion, BLIP-ScoreFusion, and BLIP-FeatureFusion models.

Training

To train the UniIR models from pretrained CLIP and BLIP checkpoints, please follow the instructions below. The scripts will automatically download the pretrained checkpoints.

1. Environment

UniIR CLIP_SF and CLIP_FF

# From the root directory of the repo
cd src/models/uniir_clip/
conda env create -f clip_env.yml

UniIR BLIP_SF and BLIP_FF

# From the root directory of the repo
cd src/models/uniir_blip/
conda env create -f blip_env.yml
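
After creating an environment, activate it before running any training or evaluation scripts. The environment names below are assumptions inferred from the YAML file names; check the name: field in clip_env.yml or blip_env.yml if yours differ.

# Assumed environment names (taken from the YAML file names; verify against the name: field in each YAML)
conda activate clip_env   # for CLIP_SF / CLIP_FF
conda activate blip_env   # for BLIP_SF / BLIP_FF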

2. Scripts

UniIR CLIP_SF

cd src/models/uniir_clip/clip_scorefusion/configs_scripts/large/train/inbatch/

Modify inbatch.yaml for hyperparameter tuning, and edit run_inbatch.sh to match your own environment and paths.

bash run_inbatch.sh
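
As a rough sketch, a typical launch looks like the following. The variable names here are hypothetical placeholders; the real settings live in inbatch.yaml and run_inbatch.sh and may be named differently there.

# Hypothetical sketch only: the actual path settings are defined in run_inbatch.sh / inbatch.yaml
export MBEIR_DATA_DIR=/path/to/M-BEIR      # assumed: root of the downloaded M-BEIR data
export UNIIR_OUTPUT_DIR=/path/to/outputs   # assumed: directory for checkpoints and logs
bash run_inbatch.sh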

UniIR BLIP_FF

cd src/models/uniir_blip/blip_featurefusion/configs_scripts/large/train/inbatch/

Modify inbatch.yaml for hyperparameter tuning, and edit run_inbatch.sh to match your own environment and paths.

bash run_inbatch.sh

Similarly, you can train the UniIR CLIP_FF and BLIP_SF models by modifying the corresponding scripts.

Evaluation

We provide the evaluation pipeline for the UniIR models on the M-BEIR benchmark.

1. Environment

# From the root directory of the repo
conda env create -f faiss_env.yml
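
Optionally, you can confirm that FAISS was installed correctly before launching the pipeline; the check below just imports the library and prints its version.

# Optional sanity check (assumes the newly created environment is activated)
python -c "import faiss; print(faiss.__version__)"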

2. Scripts

UniIR CLIP_SF

cd src/models/uniir_clip/clip_scorefusion/configs_scripts/large/eval/inbatch/

Modify embed.yaml and edit run_eval_pipeline_inbatch.sh to match your own environment and paths.

bash run_eval_pipeline_inbatch.sh

Similarly, you can evaluate the UniIR CLIP_FF, BLIP_SF, and BLIP_FF models by modifying the corresponding scripts.

Model Zoo

TODO

Citation and Contact

BibTeX:

@article{wei2023uniir,
  title={UniIR: Training and Benchmarking Universal Multimodal Information Retrievers},
  author={Wei, Cong and Chen, Yang and Chen, Haonan and Hu, Hexiang and Zhang, Ge and Fu, Jie and Ritter, Alan and Chen, Wenhu},
  journal={arXiv preprint arXiv:2311.17136},
  year={2023}
}