Merge pull request #979 from EleutherAI/benchmark-documentation
add description on task/group alias
StellaAthena committed Nov 20, 2023
2 parents afda655 + e376ad8 commit e3644fc
Showing 1 changed file with 18 additions and 0 deletions.
18 changes: 18 additions & 0 deletions docs/new_task_guide.md
@@ -273,6 +273,24 @@ to the top of any Python file that is run or imported when performing evaluation
Passing `--tasks /path/to/yaml/file` is also accepted.


## Beautifying Table Display

To avoid conflicts, each task must be registered under a unique name. As a result, even slight variations of a task count as separate tasks and need their own names. One way to do this is to add a qualifier that identifies the variation, as in MMLU, where the tasks that use the FLAN evaluation template are distinguished from the default ones by the prefix `mmlu_flan_*`. Printing full task names can easily clutter the results table at the end of an evaluation, especially when you have a long list of tasks or are using a benchmark that comprises many tasks. To make the table more legible, you can use `task_alias` and `group_alias` to provide an alternative task name and group name that will be printed instead.

For example, in `mmlu_abstract_algebra.yaml` we set `group_alias` to `stem` and `task_alias` to `abstract_algebra`.

```
"dataset_name": "abstract_algebra"
"description": "The following are multiple choice questions (with answers) about abstract\
\ algebra.\n\n"
"group": "mmlu_stem"
"group_alias": "stem"
"include": "_default_template_yaml"
"task": "mmlu_abstract_algebra"
"task_alias": "abstract_algebra"
```
Note: Even though `group` can be a list, for now, `group_alias` can only be a single string.
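
The effect of the two alias keys on the printed table can be sketched in a few lines of Python. This is only an illustration of the fallback idea (show the alias when one is set, otherwise the registered name), not the harness's actual table-printing code; the helper `display_name` and the second config entry are hypothetical.

```python
def display_name(config: dict, name_key: str = "task", alias_key: str = "task_alias") -> str:
    """Return the alias if the config sets one, otherwise the registered name.

    Hypothetical helper for illustration; it is not part of the harness.
    """
    return config.get(alias_key) or config[name_key]


configs = [
    # Values taken from the mmlu_abstract_algebra.yaml example above.
    {
        "task": "mmlu_abstract_algebra",
        "task_alias": "abstract_algebra",
        "group": "mmlu_stem",
        "group_alias": "stem",
    },
    # Hypothetical task that defines no aliases, so the full names are shown.
    {"task": "my_task_without_alias", "group": "my_group"},
]

for cfg in configs:
    group_label = display_name(cfg, name_key="group", alias_key="group_alias")
    task_label = display_name(cfg)
    print(f"{group_label} | {task_label}")
```

The registered `task` and `group` names stay unique and unchanged; only the labels shown in the table differ.
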
## Checking validity
After registering your task, you can check that its data downloads correctly and verify that the few-shot samples look as intended. Run the following command with your desired args: