Releases: embeddings-benchmark/mteb

1.6.7

15 Apr 08:44

1.6.7 (2024-04-15)

Fix

  • fix: Add Japanese Social Network Posts Sentiment Classification Dataset (#358)

  • add JaQuAD dataset

  • fix

  • add dataset stats

  • add Japanese social network posts sentiment classification dataset

  • add stats

  • fix dataset stats

  • minor fix

  • fix merging


Co-authored-by: Kenneth Enevoldsen (6433c87)

1.6.6

15 Apr 08:34

1.6.6 (2024-04-15)

Fix

  • fix: Add code search (edit of #345) (#371)

  • fix: Adding CodeSearchNet (#345)

  • add basic retrieval task

  • remove test code

  • fix metadata

  • avoid leaking labels

  • subset

  • fmt

  • points

  • attempt 1 at splitting by lang

  • add import in init

  • fix

  • this?

  • override for "Code"

  • inherit multiling

  • enable streaming

  • fmt

  • fix docs

  • Update mteb/tasks/Retrieval/code/CodeSearchNetRetrieval.py

Co-authored-by: Kenneth Enevoldsen

  • Update mteb/tasks/Retrieval/code/CodeSearchNetRetrieval.py

Co-authored-by: Kenneth Enevoldsen

  • Update mteb/tasks/Retrieval/code/CodeSearchNetRetrieval.py

Co-authored-by: Kenneth Enevoldsen

  • Update mteb/abstasks/TaskMetadata.py

Co-authored-by: Kenneth Enevoldsen

  • populate

  • taxonomy

  • results

  • fmt


Co-authored-by: Kenneth Enevoldsen

  • fix: Minor fixes to codesearch metadata

Co-authored-by: Federico Cassano (d1152e8)

1.6.5

15 Apr 07:59

1.6.5 (2024-04-15)

Fix

  • fix: Add 3 tasks for Vietnamese (#364)

  • add Retrieval and Classification datasets for Vietnamese

  • remove avg length print

  • add VieMedEV

  • update meta and points

  • fix typo

  • fix merge


Co-authored-by: Kenneth Enevoldsen (d054221)

1.6.4

15 Apr 07:41

1.6.4 (2024-04-15)

Fix

  • fix: Remove Mr. TyDi Datasets (#363)

  • remove tydi

  • add points and remove from korean benchmark (b015148)

1.6.3

14 Apr 09:09

1.6.3 (2024-04-14)

Documentation

  • docs: Updated examples in docs (adding a dataset) (#353)

  • docs: Updated examples in documentation

  • docs: Updated taskmetadata to better describe edge cases (9b63f50)

  • docs: Updated examples in documentation (#351) (6832cf0)

Fix

  • fix: add Japanese Question Answering Dataset (JaQuAD) dataset (#352)

  • add JaQuAD dataset

  • fix

  • add dataset stats

  • fix dataset stats (ae6adf4)

1.6.2

12 Apr 18:42

1.6.2 (2024-04-12)

Documentation

  • docs: Added points for previous submissions (#344)

  • docs: Added missing points for #214

Added 6x2 points for guenthermi for datasets and 1 point to Muennighoff for review.

I have not accounted for bonus points as I am not sure what was available at the time.

  • docs: added point for #197

Added 2 points for rasdani and 2 bonus points for the first German retrieval (I believe). Added one point for each of the reviewers.

  • docs: added points for #116

This includes 6 points for 3 datasets to slvnwhrl, plus 2 for the first German clustering task. Also added points for reviews.

  • Added points for #134 cmteb

This includes 29 datasets (38 points) and 6x2 bonus points (12 points) for the 6 task-language combinations that were not previously included.

All the points are attributed to @staoxiao, though we can split them if needed.

We also added points for review.

  • docs: Added points for #137 polish

This includes points for 12 datasets (24) across 4 tasks (8). These points are given to rafalposwiata, plus one point for review.

  • docs: Added points for #27 (spanish)

These include 9 datasets (18 points) across 4 new tasks (8) for Spanish.

Points are given to violenil as the contributor, and one point for reviewers. Points can be split up if needed.

  • docs: Added points for #224

Added 2 points for the dataset. I may have missed some bonus points as well. Also added one point for review.

  • docs: Added points for #210 (korean)

This includes 3 datasets (6 points) across 1 new task (+2 bonus) for Korean. Also added 1 point for reviewers.

  • Add contributor

Co-authored-by: Niklas Muennighoff (9dbf500)

Fix

  • fix: Added Hindi discourse dataset (#346)

  • Added news classification dataset.

  • Fixes on suggestions

  • Added new medical qa dataset

  • Update model run files and model path

  • Added points for dataset.

  • Fixes

  • Added hindi discourse dataset

  • Added points

  • Added avg char length

  • Fixes


Co-authored-by: Kenneth Enevoldsen (a55ae5f)

Unknown

  • Update readme.md (2dec4e9)

  • Removing bitext mining tasks from fr evaluation script (#341)

fix: removing bitextmining tasks from fr script (86ad02d)

1.6.1

11 Apr 08:10

1.6.1 (2024-04-11)

Documentation

  • docs: Update mmteb (#338)

docs: update mmteb (bee4244)

Fix

  • fix: missing json and updated tests to not run in editable mode (#340)

  • fix: Added json files to pyproject.toml

  • ci: avoid using -e when installing for tests (17c809d)

1.6.0

10 Apr 18:06

1.6.0 (2024-04-10)

Documentation

Feature

  • feat: Added new language code standard (#326)

  • fix: Added initial language code suggestion

  • docs: updated task metadata description

  • fix: changed folder structure to iso 639-3 codes

  • fix: Updated all language tags

  • clean: removed accidental results commit

  • fix: Add trusting of remote code to remove warning

  • fix: Added formatting

  • fix: trust remote code the flores dataset

  • docs: Added point for language rewrite

  • fix: reran linter after merge

  • fix: Added corrections from review

  • fix: Updated languages for newly added datasets

  • docs: added points for new annotations (f0daece)

1.5.6

10 Apr 17:22

1.5.6 (2024-04-10)

Documentation

  • docs: add points and affiliation for MartinBernstorff (#335)

docs: update points.md (2903cb4)

Fix

  • fix: Added medical qa dataset (#333)

  • Added news classification dataset.

  • Fixes on suggestions

  • Added new medical qa dataset

  • Update model run files and model path

  • Added points for dataset.

  • Fixes


Co-authored-by: Kenneth Enevoldsen (80acc3e)

Unknown

  • Update pull_request_template.md (84cffa2)

1.5.5

09 Apr 07:29

1.5.5 (2024-04-09)

Fix

  • fix: Improve logging when the revision is None (#329) (404587b)