
v1.0.1: Multi-GPU, HF dataloaders, MonoT5 rerankers and a brand new Wiki page

Released by @thakur-nandan on 30 Jun 23:25

Multiple changes have been made to the repository since the last version. You can find the latest changes listed below:

1. Brand New Wiki page for BEIR

Starting from v1.0.1, we have created a new Wiki page for the BEIR benchmark. We will keep it updated with the latest available datasets, examples of how to evaluate your models on BEIR, the leaderboard, etc. Correspondingly, we have shortened our README.md to show only the essential information. For a full overview, one can view the BEIR Wiki.

You can view the BEIR Wiki here: https://github.com/beir-cellar/beir/wiki.

2. Multi-GPU evaluation with SBERT dense retrievers using distributed evaluation

Thanks to @NouamaneTazi, we now support multi-GPU evaluation of SBERT models across all datasets in BEIR. This is most beneficial for large datasets such as BioASQ, where encoding takes at least a day on a single GPU. With multiple GPUs, one can evaluate such datasets much faster than with the old single-GPU evaluation. The only caveat: running on multiple GPUs requires the evaluate library to be installed, which in turn requires Python >= 3.7.

Example: evaluate_sbert_multi_gpu.py
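For reference, here is a minimal sketch of the multi-GPU flow. It assumes the parallel exact-search class is exposed as DenseRetrievalParallelExactSearch, following the example script above; check that script for the exact import path and arguments in your version:

```python
from beir import util
from beir.datasets.data_loader import GenericDataLoader
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import DenseRetrievalParallelExactSearch as DRPES

# Download and unzip a BEIR dataset (NFCorpus is small enough for a quick test).
url = "https://public.ukp.informatik.tu-darmstadt.de/thakur/BEIR/datasets/nfcorpus.zip"
data_path = util.download_and_unzip(url, "datasets")
corpus, queries, qrels = GenericDataLoader(data_folder=data_path).load(split="test")

# The parallel search class shards corpus encoding across all visible GPUs
# (it relies on the `evaluate` library, which requires Python >= 3.7).
model = DRPES(
    models.SentenceBERT("msmarco-distilbert-base-tas-b"),
    batch_size=128,
    corpus_chunk_size=50000,
)
retriever = EvaluateRetrieval(model, score_function="dot")

results = retriever.retrieve(corpus, queries)
ndcg, _map, recall, precision = retriever.evaluate(qrels, results, retriever.k_values)
```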

3. Hugging Face data loaders for BEIR datasets; all datasets uploaded to HF

We added Hugging Face data loaders for all the public BEIR datasets, so one can easily work with the BEIR datasets available on Hugging Face. For every public BEIR dataset, we uploaded the corpus and queries (e.g. BeIR/fiqa) as well as the qrels (e.g. BeIR/fiqa-qrels) to the Hugging Face Hub. This means one no longer needs to download a dataset and keep it locally in RAM. Again, thanks to @NouamaneTazi.

You can find all datasets here: https://huggingface.co/BeIR
Example: evaluate_sbert_hf_loader.py
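As a quick illustration, the data can also be pulled straight from the Hub with the datasets library. The config and split names below follow the BeIR/fiqa layout mentioned above and may differ per dataset:

```python
from datasets import load_dataset

# Corpus and queries live in the BeIR/fiqa repo as separate configs;
# the relevance judgements live in the companion BeIR/fiqa-qrels repo.
corpus = load_dataset("BeIR/fiqa", "corpus", split="corpus")
queries = load_dataset("BeIR/fiqa", "queries", split="queries")
qrels = load_dataset("BeIR/fiqa-qrels", split="test")

print(corpus[0])  # e.g. {"_id": ..., "title": ..., "text": ...}
print(qrels[0])   # e.g. {"query-id": ..., "corpus-id": ..., "score": ...}
```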

4. Added support for the T5 reranking model: monoT5

We added support for the monoT5 reranking model within BEIR. These are stronger (but more expensive) rerankers that currently attain the best reranking performance on the BEIR benchmark.

Example: evaluate_bm25_monot5_reranking.py
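A minimal reranking sketch, assuming first-stage results in BEIR's standard {query_id: {doc_id: score}} format and a MonoT5 class under beir.reranking; the castorini checkpoint name is illustrative, so see the example script for the exact setup:

```python
from beir.reranking import Rerank
from beir.reranking.models import MonoT5

# `corpus` and `queries` come from a BEIR data loader; `results` holds the
# first-stage (e.g. BM25) scores as {query_id: {doc_id: score}}.
cross_encoder = MonoT5("castorini/monot5-base-msmarco")
reranker = Rerank(cross_encoder, batch_size=128)

# Rerank the top-100 first-stage hits per query with monoT5.
rerank_results = reranker.rerank(corpus, queries, results, top_k=100)
```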

5. Fix: Added ignore_identical_ids to BEIR evaluation

Thanks to @kwang2049, we added a check to ignore identical ids within the evaluation script. This particularly matters for the ArguAna and Quora datasets, where a document and a query can be identical (and share the same id). By default, we now remove these pairs and evaluate accordingly. With this fix, one can evaluate Quora and ArguAna and obtain accurate, reproducible nDCG@10 scores.
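In practice this is just a flag on the evaluation call. The sketch below assumes the parameter is named ignore_identical_ids and defaults to True, per the description above:

```python
from beir.retrieval.evaluation import EvaluateRetrieval

# Drops (query_id == doc_id) pairs before computing metrics, which matters
# for ArguAna and Quora where a query can also appear as a document.
ndcg, _map, recall, precision = EvaluateRetrieval.evaluate(
    qrels, results, k_values=[1, 3, 5, 10, 100], ignore_identical_ids=True
)
```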

6. Added the HNSWSQ method to the Faiss retrieval methods

We added support for the HNSWSQ Faiss index method as a memory compression-based technique for evaluation across the BEIR datasets.
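A sketch of how this plugs in, assuming the new index is exposed as HNSWSQFaissSearch alongside the other Faiss search classes (the class name follows the module's naming convention and should be verified against your installed version):

```python
from beir.retrieval import models
from beir.retrieval.evaluation import EvaluateRetrieval
from beir.retrieval.search.dense import HNSWSQFaissSearch

# HNSW graph built over scalar-quantized (SQ) vectors: a smaller index in
# memory, at the cost of approximate (rather than exact) search.
faiss_search = HNSWSQFaissSearch(
    models.SentenceBERT("msmarco-distilbert-base-tas-b"),
    batch_size=128,
)
retriever = EvaluateRetrieval(faiss_search, score_function="dot")
results = retriever.retrieve(corpus, queries)
```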

7. Added dependency on the datasets library within setup.py

To support the HF data loaders, we added the datasets library as a dependency within our setup.py.
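For the curious, the change amounts to an extra entry in install_requires. The excerpt below is illustrative only, not the full dependency list:

```python
# setup.py (illustrative excerpt)
from setuptools import setup, find_packages

setup(
    name="beir",
    packages=find_packages(),
    install_requires=[
        "sentence-transformers",
        "pytrec_eval",
        "datasets",  # new in v1.0.1: powers the Hugging Face data loaders
    ],
)
```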