Add KoCommonGEN v2 benchmark #2208
# "choices": [f"{doc[str(i+1)]}" for i in range(4)],
# "choices": [f'{str(i+1)}. ' + doc['{i}'.format(i=i + 1)] for i in range(4)], # The list of choices.
# "choices": [str(i+1) for i in range(4)], # The list of choices.
Just want to check if this is an alternative option (which is why it's commented but left in)?
+1, if this comment is safe to delete let's do so!
Thanks for the PR! Just a few small changes and then we can merge this.
@@ -0,0 +1,19 @@
task: ko_commongen_v2
Let's remove the `task:` field since this is a template/stub config.
out_doc = {
    "query": query,
    "choices": [f"{i+1}. {doc[str(i+1)]}" for i in range(4)],
    # "choices": [f"{doc[str(i+1)]}" for i in range(4)],
Same here: if this commented-out line is safe to delete, let's do so.
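For reference, here is a minimal sketch of what the active `choices` line and the commented-out alternatives would each produce. It assumes a dataset row that stores its four options under the string keys "1" through "4", as the diff suggests; the helper name `build_out_doc`, the sample row, and the query text are hypothetical and not taken from the PR.

```python
def build_out_doc(doc, query):
    """Hypothetical helper wrapping the dict construction shown in the diff."""
    return {
        "query": query,
        # Active variant: "1. <text>" ... "4. <text>"
        "choices": [f"{i+1}. {doc[str(i+1)]}" for i in range(4)],
    }


# Made-up sample row; only the keys "1".."4" are implied by the diff.
sample = {"1": "alpha", "2": "beta", "3": "gamma", "4": "delta"}

out_doc = build_out_doc(sample, query="Which option is correct?")

# The commented-out alternatives, evaluated on the same row for comparison.
plain_choices = [f"{sample[str(i+1)]}" for i in range(4)]  # option text only
label_choices = [str(i + 1) for i in range(4)]             # labels only

print(out_doc["choices"])
print(plain_choices)
print(label_choices)
```

The remaining commented-out variant, `f'{str(i+1)}. ' + doc['{i}'.format(i=i + 1)]`, builds the same strings as the active line, so deleting it should not change behavior.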
Hi @metterian, just following up to see if you'd be able to make these final few changes so we can merge this task! If not, we'll try to get to them ourselves. Note also that we'd ideally have an entry in https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/tasks/README.md describing the task as well, so users know about your task!
Description:
This PR adds support for the KoCommonGEN v2 benchmark, a new dataset for evaluating Korean commonsense reasoning in large language models.
Changes:
KoCommonGEN v2 Details:
This benchmark provides a valuable resource for evaluating Korean language models on commonsense reasoning tasks. Adding it to our evaluation suite will help broaden our coverage of multilingual NLP capabilities.
Please review and let me know if any changes or additional information is needed.