Release v0.2.3: NeurIPS acceptance, Multilingual datasets and Top-k accuracy metric fixed · beir-cellar/beir

This is a small release update!

1. BEIR Benchmark paper accepted at NeurIPS 2021 (Datasets and Benchmarks Track)

I'm thrilled to share that BEIR has been accepted at the NeurIPS 2021 conference. All reviewers gave positive reviews and found the benchmark useful for the community. More information can be found here: https://openreview.net/forum?id=wCu6T5xFjeJ.

2. New Multilingual datasets added within BEIR

New multilingual datasets have been added to the BEIR Benchmark, which now supports more than 10 languages. We included mMARCO (https://github.com/unicamp-dl/mMARCO), the MSMARCO dataset translated into 8 languages, and Mr.TyDi (https://github.com/castorini/mr.tydi), which contains train, development, and test data across 10 languages. We hope to provide good and robust dense multilingual retrievers in the future.

3. Breaking change in Top-k accuracy now fixed

The top-k accuracy metric was mistakenly sorting the retrieved keys instead of the retriever model scores, which would have led to incorrect accuracy values. The mistake was identified in #45, and the fix has now been merged.
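
To illustrate why the sort key matters, here is a minimal sketch of a top-k accuracy computation that ranks by score. It is not BEIR's actual implementation: the function name `top_k_accuracy`, the `Accuracy@k` output keys, and the nested-dict layout (query ID to document ID to score or relevance label) are assumptions made for this example.

```python
import heapq


def top_k_accuracy(qrels, results, k_values):
    """Fraction of queries with at least one relevant document in the top k.

    qrels:   {query_id: {doc_id: relevance_label}}   (assumed layout)
    results: {query_id: {doc_id: retriever_score}}   (assumed layout)
    """
    hits = {k: 0 for k in k_values}
    k_max = max(k_values)

    for query_id, doc_scores in results.items():
        # Rank documents by retriever score, highest first.
        # Sorting the keys (document IDs) instead would yield an
        # arbitrary ranking unrelated to the model's output.
        ranked = heapq.nlargest(k_max, doc_scores.items(), key=lambda item: item[1])
        ranked_ids = [doc_id for doc_id, _ in ranked]

        relevant = {
            doc_id
            for doc_id, label in qrels.get(query_id, {}).items()
            if label > 0
        }
        for k in k_values:
            if relevant.intersection(ranked_ids[:k]):
                hits[k] += 1

    num_queries = len(results)
    return {
        f"Accuracy@{k}": (hits[k] / num_queries if num_queries else 0.0)
        for k in k_values
    }


# Toy data (hypothetical): d3 is the relevant document for q1, d1 for q2.
qrels = {"q1": {"d3": 1}, "q2": {"d1": 1}}
results = {
    "q1": {"d1": 0.2, "d2": 0.1, "d3": 0.9},
    "q2": {"d1": 0.1, "d2": 0.8, "d3": 0.5},
}
scores = top_k_accuracy(qrels, results, k_values=[1, 3])
print(scores)  # {'Accuracy@1': 0.5, 'Accuracy@3': 1.0}
```

In this toy data, ranking by document ID would put `d1` first for both queries, so q1 would be scored as a miss and q2 as a hit at k=1, the opposite of what the retriever scores imply.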

4. Yannic Kilcher recognized BEIR as a helpful ML library

Yannic Kilcher recently mentioned the BEIR repository as a helpful library for benchmarking and evaluating diverse IR models and architectures. You can find more details in his latest ML News video on YouTube: https://www.youtube.com/watch?v=K3cmxn5znyU&t=1290s&ab_channel=YannicKilcher
