Add new benchmark: Spanish bench #2157

Open
wants to merge 12 commits into
base: main
from

Conversation

zxcvuser
Contributor

SpanishBench is a benchmark for tasks in Spanish that cover several evaluation areas. The datasets consist of professional translations of relevant English datasets and newly created datasets in Spanish. The README.md contains detailed information on all the tasks included in the benchmark.

@CLAassistant

CLAassistant commented Jul 30, 2024

CLA assistant check
All committers have signed the CLA.

@lintangsutawika
Contributor

@zxcvuser this just needs a little help: run pre-commit run --all-files and it should be good.

@baberabb
Contributor

baberabb commented Sep 18, 2024

Getting an evaluate/bleu-related error (not due to this PR):

AttributeError: 'DownloadConfig' object has no attribute 'use_auth_token'

Will fix the underlying cause before merging!

edit: might be due to name shadowing of paws_es

@@ -0,0 +1,28 @@
group: flores
Contributor

Suggested change
- group: flores
+ tag: flores

@@ -0,0 +1,20 @@
group:
Contributor

Can we remove this, please?

@@ -0,0 +1,20 @@
group:
- pawsx
task: paws_es
Contributor

maybe paws_ex_bench? there is already a paws_es task
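
The collision flagged above (two configs declaring the same task name) can be checked mechanically. The following is only a minimal sketch, not part of the PR: `find_duplicate_tasks` is a hypothetical helper, and the file paths are invented for illustration. It assumes each config has already been reduced to the value of its `task:` key.

```python
from collections import defaultdict


def find_duplicate_tasks(configs):
    """Return {task_name: [paths]} for every task name declared more than once.

    `configs` maps a (hypothetical) YAML path to the `task:` value it declares.
    """
    seen = defaultdict(list)
    for path, task in configs.items():
        seen[task].append(path)
    # Keep only the names that appear in more than one config file.
    return {task: sorted(paths) for task, paths in seen.items() if len(paths) > 1}


# Hypothetical paths and names, for illustration only.
configs = {
    "tasks/paws-x/paws_es.yaml": "paws_es",
    "tasks/spanish_bench/paws_es.yaml": "paws_es",
    "tasks/spanish_bench/xnli_es.yaml": "xnli_es_spanish_bench",
}
duplicates = find_duplicate_tasks(configs)
```

A non-empty result means at least one task name would shadow another and should be renamed.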

@@ -0,0 +1,24 @@
group: phrases_es
Contributor

Suggested change
- group: phrases_es
+ tag: phrases_es
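
The group-to-tag suggestions in this review are a one-line change per config. As a rough sketch of applying that change across many files, the hypothetical helper below (`group_to_tag`, not something from the PR or the harness) rewrites a top-level scalar `group:` key to `tag:`. It assumes the configs are simple enough to edit as text, and it deliberately leaves list-style `group:` blocks untouched, since the review asks for those to be removed rather than renamed.

```python
import re


def group_to_tag(yaml_text):
    """Rename a top-level scalar `group:` key to `tag:`; leave other lines alone."""
    # Match `group:` only at column 0 and only when a value follows on the
    # same line, so a list-style `group:` block is not rewritten.
    return re.sub(r"^group:(?=[ \t]+\S)", "tag:", yaml_text, flags=re.MULTILINE)


# `example_task` is a placeholder task name, not one from the PR.
before = "group: phrases_es\ntask: example_task\n"
after = group_to_tag(before)

# A list-style group is left as-is for manual review.
list_style = "group:\n  - pawsx\ntask: paws_es\n"
unchanged = group_to_tag(list_style)
```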

@zxcvuser
Contributor Author

These are the changes made:

  • Added the benchmark info in lm_eval/tasks/README.md
  • Replaced "-" with "_" in the create_files script in flores_es and added weight_by_size: false
  • Ran linters
  • Removed grouping in the paws_es, xnli, and mgsm tasks (they were pointing to pre-existing benchmarks) and renamed the tasks with the suffix "_spanish_bench", both in each task YAML and in spanish_bench.yaml
  • Changed "group" to "tag" in phrases_es

With these, it should all be fine now. Thank you!
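
The two renaming steps in the list above (hyphens to underscores, and the "_spanish_bench" suffix) can be sketched as small string helpers. This is an illustration under assumptions, not the code in the PR: `normalize_name` and `add_bench_suffix` are hypothetical names, and the sample inputs are invented.

```python
SUFFIX = "_spanish_bench"


def normalize_name(name):
    """Replace hyphens with underscores, as described for the flores_es script."""
    return name.replace("-", "_")


def add_bench_suffix(name, suffix=SUFFIX):
    """Append the benchmark suffix unless the name already ends with it."""
    return name if name.endswith(suffix) else name + suffix


# Hypothetical inputs, for illustration only.
flores_name = normalize_name("flores_es-en")
renamed = [add_bench_suffix(n) for n in ["paws_es", "mgsm_es_spanish_bench"]]
```

Making the suffix step idempotent means the script can be re-run without producing doubled suffixes.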


4 participants