Release v0.2.0: New Features Integrated with BEIR · beir-cellar/beir

FAISS Indexes and Search Integration

FAISS indexes can now be created and used for evaluation within the BEIR repository. We have added support for Flat-IP, HNSW, PQ, PCAMatrix, and BinaryFlat indexes.
Several of these index types compress the embeddings or approximate the search, which reduces index memory size or improves retrieval speed.
You can also save your corpus embeddings as a FAISS index, which was not possible with the original exact search.
Check out how to evaluate dense retrieval using a FAISS index [here] and dimension reduction using PCA [here].
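
To make the memory trade-off concrete, here is a minimal pure-Python sketch, not the FAISS or BEIR implementation. It shows what an exhaustive inner-product (Flat-IP style) search computes, and roughly how many bytes each stored vector needs under three storage schemes. The function names and the `index_type` labels are hypothetical and chosen only for illustration. The estimates assume float32 storage for flat indexes, one bit per dimension for binary indexes, and `pq_bits` bits per sub-quantizer for PQ, and they ignore any per-index overhead.

```python
import heapq


def flat_ip_search(query, corpus, k):
    """Exhaustive inner-product search: score every vector, keep the top k."""
    scored = [
        (sum(q * d for q, d in zip(query, doc)), idx)
        for idx, doc in enumerate(corpus)
    ]
    # nlargest returns (score, idx) pairs ordered from highest score down.
    return [(idx, score) for score, idx in heapq.nlargest(k, scored)]


def bytes_per_vector(dim, index_type, pq_subquantizers=None, pq_bits=8):
    """Rough per-vector storage cost, ignoring index overhead."""
    if index_type == "flat":      # one float32 (4 bytes) per dimension
        return dim * 4
    if index_type == "binary":    # one bit per dimension
        return dim // 8
    if index_type == "pq":        # one small code per sub-quantizer
        return pq_subquantizers * pq_bits // 8
    raise ValueError(f"unknown index type: {index_type}")
```

Under these assumptions, a 768-dimensional embedding takes 3072 bytes in a flat index and 96 bytes as a binary code, a 32x reduction, which is why the compressed index types matter for large corpora.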

Thanks to @julian-risch, we have added our first multilingual dataset to the BEIR repository: GermanQuAD (German SQuAD dataset).
We have updated the Elasticsearch integration to allow evaluation on languages other than English; check it out [here].
We have also added a DPR model class that lets you load DPR models from the Hugging Face repository. You can now use this class for evaluation with, for example, the GermanDPR model [link].
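
As a rough illustration of the Elasticsearch change, non-English lexical search usually comes down to choosing a language-specific analyzer for the text fields. The sketch below builds an index body that uses one of Elasticsearch's built-in language analyzers (such as `english` or `german`). The helper name and the field names `title` and `txt` are assumptions made for this example and are not necessarily what BEIR uses internally.

```python
def build_index_body(language="english", title_field="title", text_field="txt"):
    """Build an Elasticsearch index body whose text fields use a language analyzer.

    `language` must name a built-in Elasticsearch analyzer, e.g. "german".
    Field names are hypothetical placeholders for this sketch.
    """
    return {
        "settings": {"number_of_shards": 1},
        "mappings": {
            "properties": {
                title_field: {"type": "text", "analyzer": language},
                text_field: {"type": "text", "analyzer": language},
            }
        },
    }


german_body = build_index_body(language="german")
```

The resulting dictionary could be passed as the body when creating an index, so that stemming and stop-word removal follow German rather than English rules.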

We have ported the original DeepCT code to work with TensorFlow (tf) >v2.0 and now host the updated repo [here].
Using the hosted code, DeepCT can now be used for evaluation in BEIR with Anserini retrieval; check [here].

From the SentenceTransformers repository, we have integrated the latest MSMARCO training code, which uses custom, manually provided hard negatives. This code produces the state-of-the-art SBERT models trained on MSMARCO; check [here].

A big challenge was using multiple GPUs to speed up question generation. We have added process pools so that questions are now generated in parallel across multiple GPUs, which is much faster; check [here].
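
The general pattern behind this can be sketched as follows: split the passages into one shard per device, then let a process pool run one worker per shard. This is a minimal illustration rather than the BEIR code. The function names are hypothetical, and `generate_on_device` is a stand-in that only formats strings, where a real worker would load the question-generation model onto its device.

```python
from concurrent.futures import ProcessPoolExecutor


def shard_round_robin(items, num_shards):
    """Split items into num_shards lists of near-equal size."""
    return [items[i::num_shards] for i in range(num_shards)]


def generate_on_device(job):
    """Placeholder worker: a real one would run the generator model on `device`."""
    device, passages = job
    return [f"[{device}] question for: {p}" for p in passages]


def generate_parallel(passages, devices):
    """Run one worker process per device, each handling its own shard."""
    jobs = list(zip(devices, shard_round_robin(passages, len(devices))))
    with ProcessPoolExecutor(max_workers=len(devices)) as pool:
        results = list(pool.map(generate_on_device, jobs))
    # Output is grouped by shard, so it does not follow the input order.
    return [question for shard in results for question in shard]
```

A call such as `generate_parallel(passages, ["cuda:0", "cuda:1"])` would be placed under an `if __name__ == "__main__":` guard. With real CUDA workers, the `spawn` start method is generally needed, since CUDA cannot be re-initialized in a forked process.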

BPR (ACL'21, link) is now integrated within the BEIR benchmark. You can now easily train a state-of-the-art BPR model on MSMARCO using the loss function described in the original paper; check [here].
You can also easily evaluate BPR in a zero-shot fashion; check [here].
We will soon open-source the public BPR models trained on MSMARCO.
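
For readers new to BPR, retrieval runs in two stages: passages are stored as binary codes, candidates are selected by Hamming distance to the binarized query, and the candidates are then reranked using the inner product between the continuous query embedding and the binary passage codes. The sketch below illustrates that idea in plain Python under simplifying assumptions (sign-based hashing, codes stored as lists of +1/-1). The function names are hypothetical, and this is not the BEIR or original BPR implementation.

```python
def to_binary_code(vec):
    """Hash a real-valued vector to a +1/-1 code using the sign of each entry."""
    return [1 if x >= 0 else -1 for x in vec]


def hamming(a, b):
    """Number of positions where two codes differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def bpr_search(query_vec, passage_codes, num_candidates, k):
    """Two-stage search: Hamming candidate generation, then inner-product rerank."""
    q_code = to_binary_code(query_vec)
    # Stage 1: keep the passages whose codes are closest to the query code.
    candidates = sorted(
        range(len(passage_codes)),
        key=lambda i: hamming(q_code, passage_codes[i]),
    )[:num_candidates]
    # Stage 2: rerank candidates with the continuous query embedding.
    scored = [
        (sum(q * c for q, c in zip(query_vec, passage_codes[i])), i)
        for i in candidates
    ]
    scored.sort(key=lambda pair: -pair[0])
    return [(i, score) for score, i in scored[:k]]
```

The first stage only compares compact binary codes, so it is cheap in both memory and compute; the second stage recovers some of the ranking quality lost to binarization.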