Releases: embeddings-benchmark/mteb
1.6.7 (2024-04-15)
Fix
- fix: Add Japanese Social Network Posts Sentiment Classification Dataset (#358)
- add JaQuAD dataset
- fix
- add dataset stats
- add Japanese social network posts sentiment classification dataset
- add stats
- fix dataset stats
- minor fix
- fix merging
Co-authored-by: Kenneth Enevoldsen <[email protected]> (6433c87)
1.6.6 (2024-04-15)
Fix
- fix: Adding CodeSearchNet (#345)
- add basic retrieval task
- remove test code
- fix metadata
- avoid leaking labels
- subset
- fmt
- points
- attempt 1 at splitting by lang
- add import in init
- fix
- this?
- override for "Code"
- inherit multiling
- enable streaming
- fmt
- fix docs
- Update mteb/tasks/Retrieval/code/CodeSearchNetRetrieval.py
Co-authored-by: Kenneth Enevoldsen <[email protected]>
- Update mteb/tasks/Retrieval/code/CodeSearchNetRetrieval.py
Co-authored-by: Kenneth Enevoldsen <[email protected]>
- Update mteb/tasks/Retrieval/code/CodeSearchNetRetrieval.py
Co-authored-by: Kenneth Enevoldsen <[email protected]>
- Update mteb/abstasks/TaskMetadata.py
Co-authored-by: Kenneth Enevoldsen <[email protected]>
- populate
- taxonomy
- results
- fmt
Co-authored-by: Kenneth Enevoldsen <[email protected]>
- fix: Minor fixes to codesearch metadata
Co-authored-by: Federico Cassano <[email protected]> (d1152e8)
1.6.5 (2024-04-15)
Fix
- fix: Add 3 tasks for Vietnamese (#364)
- add Retrieval and Classification datasets for Vietnamese
- remove avg length print
- add VieMedEV
- update meta and points
- fix typo
- fix merge
Co-authored-by: Kenneth Enevoldsen <[email protected]> (d054221)
1.6.4
1.6.3
1.6.2 (2024-04-12)
Documentation
Added 6x2 points for guenthermi for datasets and 1 point to Muennighoff for review.
I have not accounted for bonus points, as I am not sure what was available at the time.
- docs: added point for #197
Added 2 points for rasdani and 2 bonus points for the first German retrieval task (I believe). Added one point for each of the reviewers.
- docs: added points for #116
This includes 6 points for 3 datasets to slvnwhrl, plus 2 for the first German clustering task. Also added points for reviews.
- Added points for #134 cmteb
This includes 29 datasets (38 points) and 6x2 bonus points (12 points) for the 6 task x language combinations that were not previously included.
All the points are attributed to @staoxiao, though we can split them if needed.
We also added points for review.
- docs: Added points for #137 polish
This includes points for 12 datasets (24) across 4 tasks (8). These points are given to rafalposwiata, plus one point for review.
- docs: Added points for #27 (spanish)
These include 9 datasets (18 points) across 4 new tasks (8) for Spanish.
Points are given to violenil as the contributor, and one point for reviewers. Points can be split up if needed.
- docs: Added points for #224
Added 2 points for the dataset. I could imagine that I might have missed some bonus points as well. Also added one point for review.
- docs: Added points for #210 (korean)
This includes 3 datasets (6 points) across 1 new task (+2 bonus) for Korean. Also added 1 point for reviewers.
- Add contributor
Co-authored-by: Niklas Muennighoff <[email protected]> (9dbf500)
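Several of the point totals quoted in these notes appear to follow the same arithmetic: 2 points per dataset plus 2 bonus points per new task. The sketch below reproduces two of those totals. The constants and the `tally_points` helper are hypothetical, inferred only from the figures above, and are not an official scoring rule of the project.

```python
# Assumed values, inferred from the figures quoted in the release notes
# (e.g. "12 datasets (24) across 4 tasks (8)"); not an official rule.
POINTS_PER_DATASET = 2
BONUS_PER_NEW_TASK = 2


def tally_points(datasets: int, new_tasks: int = 0) -> int:
    """Return dataset points plus the bonus for new tasks (hypothetical helper)."""
    return datasets * POINTS_PER_DATASET + new_tasks * BONUS_PER_NEW_TASK


if __name__ == "__main__":
    # Polish (#137): 12 datasets (24) across 4 tasks (8)
    print(tally_points(datasets=12, new_tasks=4))
    # Korean (#210): 3 datasets (6) across 1 new task (+2 bonus)
    print(tally_points(datasets=3, new_tasks=1))
```

Review points are awarded separately in the notes and are left out of this sketch.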
Fix
- fix: Added Hindi discourse dataset (#346)
- Added news classification dataset.
- Fixes on suggestions
- Added new medical qa dataset
- Update model run files and model path
- Added points for dataset.
- Fixes
- Added hindi discourse dataset
- Added points
- Added avg char length
- Fixes
Co-authored-by: Kenneth Enevoldsen <[email protected]> (a55ae5f)
Unknown
- fix: removing bitextmining tasks from fr script (86ad02d)
1.6.1
1.6.0 (2024-04-10)
Documentation
Feature
- feat: Added new language code standard (#326)
- fix: Added initial language code suggestion
- docs: updated task metadata description
- fix: changed folder structure to iso 639-3 codes
- fix: Updated all language tags
- clean: removed accidental results commit
- fix: Add trusting of remote code to remove warning
- fix: Added formatting
- fix: trust remote code the flores dataset
- docs: Added point for language rewrite
- fix: reran linter after merge
- fix: Added corrections from review
- fix: Updated languages for newly added datasets
- docs: added points for new annotations (f0daece)
1.5.6 (2024-04-10)
Documentation
- docs: add points and affiliation for MartinBernstorff (#335)
docs: update points.md (2903cb4)
Fix
- fix: Added medical qa dataset (#333)
- Added news classification dataset.
- Fixes on suggestions
- Added new medical qa dataset
- Update model run files and model path
- Added points for dataset.
- Fixes
Co-authored-by: Kenneth Enevoldsen <[email protected]> (80acc3e)
Unknown
- Update pull_request_template.md (84cffa2)