diff --git a/docs/new_task_guide.md b/docs/new_task_guide.md
index 86966be541..9debb8ef30 100644
--- a/docs/new_task_guide.md
+++ b/docs/new_task_guide.md
@@ -273,6 +273,24 @@ to the top of any Python file that is run or imported when performing evaluation
 
 Passing `--tasks /path/to/yaml/file` is also accepted.
 
+## Beautifying Table Display
+
+To avoid conflicts, each task needs to be registered with a unique name. Because of this, even slight variations of a task count as distinct tasks and must be named uniquely. This is usually done by appending a suffix or prefix that identifies the variation, as in MMLU, where the tasks evaluated with the Flan prompt template are distinguished from the default ones by the `mmlu_flan_*` prefix. Printing these full task names can easily clutter the results table at the end of an evaluation, especially when you run a long list of tasks or a benchmark made up of many subtasks. To make the table more legible, you can use `task_alias` and `group_alias` to provide alternative task and group names that will be printed instead.
+
+For example, in `mmlu_abstract_algebra.yaml` we set `group_alias` to `stem` and `task_alias` to `abstract_algebra`:
+
+```
+"dataset_name": "abstract_algebra"
+"description": "The following are multiple choice questions (with answers) about abstract\
+  \ algebra.\n\n"
+"group": "mmlu_stem"
+"group_alias": "stem"
+"include": "_default_template_yaml"
+"task": "mmlu_abstract_algebra"
+"task_alias": "abstract_algebra"
+```
+Note: Even though `group` can be a list, for now `group_alias` can only be a single string.
+
 ## Checking validity
 
 After registering your task, you can now check on your data downloading and verify that the few-shot samples look as intended. Run the following command with your desired args:
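To illustrate the note at the end of the new section: a task config may list several groups while still carrying only a single string alias. Below is a minimal sketch of such a config (not part of the patch above), assuming a hypothetical second group named `my_custom_suite` alongside the real `mmlu_stem` group.

```
# Sketch only: `my_custom_suite` is a hypothetical group name used for illustration.
"include": "_default_template_yaml"
"task": "mmlu_abstract_algebra"
"task_alias": "abstract_algebra"
# `group` may be a list of group names...
"group":
  - "mmlu_stem"
  - "my_custom_suite"
# ...but `group_alias` must stay a single string; a list is not supported here.
"group_alias": "stem"
```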