Commit

PR fixing the issue EleutherAI#1391 (wrong contexts in the mgsm task) (EleutherAI#1440)

* fix the issue EleutherAI#1391, wrong contexts in mgsm tasks

* fix yaml issue of having two target_delimiter lines. For CoT tasks, keep the one with a space (the default)

* regenerate all task yaml files
  - change naming so that each file name matches its task name
  - task and file names follow one consistent scheme, mgsm_(mode)_(lang), for the three modes: direct, en_cot, and native_cot

* English CoTs should have a space as target_delimiter

* Update utils.py

* Apply suggestions from code review

---------

Co-authored-by: Hailey Schoelkopf
2 people authored and nightingal3 committed May 2, 2024
1 parent b5ee714 commit 6be1045
Showing 48 changed files with 193 additions and 162 deletions.
6 changes: 2 additions & 4 deletions lm_eval/tasks/mgsm/direct/mgsm_direct_bn.yaml
@@ -1,8 +1,6 @@
 # Generated by utils.py
 dataset_name: bn
-doc_to_target: '{% if answer is not none %}{{answer[6+1]}}{% else %}{{answer_number|string}}{%
-  endif %}'
-doc_to_text: '{% if answer is not none %}{{question+"\nAnswer"}}{% else %}{{"প্রশ্ন:
-  "+question+"\nAnswer"}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[17:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nAnswer:"}}{% else %}{{"প্রশ্ন: "+question+"\nAnswer:"}}{% endif %}'
 include: direct_yaml
 task: mgsm_direct_bn
6 changes: 2 additions & 4 deletions lm_eval/tasks/mgsm/direct/mgsm_direct_de.yaml
@@ -1,8 +1,6 @@
 # Generated by utils.py
 dataset_name: de
-doc_to_target: '{% if answer is not none %}{{answer[7+1]}}{% else %}{{answer_number|string}}{%
-  endif %}'
-doc_to_text: '{% if answer is not none %}{{question+"\nAntwort"}}{% else %}{{"Frage:
-  "+question+"\nAntwort"}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[29:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nAntwort:"}}{% else %}{{"Frage: "+question+"\nAntwort:"}}{% endif %}'
 include: direct_yaml
 task: mgsm_direct_de
6 changes: 2 additions & 4 deletions lm_eval/tasks/mgsm/direct/mgsm_direct_en.yaml
@@ -1,8 +1,6 @@
 # Generated by utils.py
 dataset_name: en
-doc_to_target: '{% if answer is not none %}{{answer[6+1]}}{% else %}{{answer_number|string}}{%
-  endif %}'
-doc_to_text: '{% if answer is not none %}{{question+"\nAnswer"}}{% else %}{{"Question:
-  "+question+"\nAnswer"}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[21:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nAnswer:"}}{% else %}{{"Question: "+question+"\nAnswer:"}}{% endif %}'
 include: direct_yaml
 task: mgsm_direct_en
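
The doc_to_target change in these diffs is easier to see outside Jinja. Below is a minimal Python sketch, assuming a few-shot answer string that starts with the 21-character prefix "Step-by-Step Answer: " (the sample wording is illustrative, not copied from the dataset). Indexing with answer[6+1] returns a single character, while slicing with answer[21:] drops the prefix and keeps the rest of the answer.

```python
# Illustrative few-shot answer; the exact wording is an assumption.
PREFIX = "Step-by-Step Answer: "
answer = PREFIX + "5 + 6 = 11. The answer is 11."

# Old template style: a single index, so the target is one character.
old_target = answer[6 + 1]

# New template style: a slice starting at the prefix length,
# so the target is the full answer text without the prefix.
new_target = answer[len(PREFIX):]

print(len(PREFIX))       # length of the English prefix
print(repr(old_target))  # the single character at index 7
print(new_target)        # the answer text with the prefix removed
```

The per-language offsets in the new files (17, 29, 21, and so on) play the role of len(PREFIX) here, which is why they differ from language to language.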
6 changes: 2 additions & 4 deletions lm_eval/tasks/mgsm/direct/mgsm_direct_es.yaml
@@ -1,8 +1,6 @@
 # Generated by utils.py
 dataset_name: es
-doc_to_target: '{% if answer is not none %}{{answer[6+1]}}{% else %}{{answer_number|string}}{%
-  endif %}'
-doc_to_text: '{% if answer is not none %}{{question+"\nAnswer"}}{% else %}{{"Pregunta:
-  "+question+"\nAnswer"}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[23:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nRespuesta:"}}{% else %}{{"Pregunta: "+question+"\nRespuesta:"}}{% endif %}'
 include: direct_yaml
 task: mgsm_direct_es
6 changes: 2 additions & 4 deletions lm_eval/tasks/mgsm/direct/mgsm_direct_fr.yaml
@@ -1,8 +1,6 @@
 # Generated by utils.py
 dataset_name: fr
-doc_to_target: '{% if answer is not none %}{{answer[6+1]}}{% else %}{{answer_number|string}}{%
-  endif %}'
-doc_to_text: '{% if answer is not none %}{{question+"\nAnswer"}}{% else %}{{"Question
-  : "+question+"\nAnswer"}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[26:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nRéponse :"}}{% else %}{{"Question : "+question+"\nRéponse :"}}{% endif %}'
 include: direct_yaml
 task: mgsm_direct_fr
6 changes: 2 additions & 4 deletions lm_eval/tasks/mgsm/direct/mgsm_direct_ja.yaml
@@ -1,8 +1,6 @@
 # Generated by utils.py
 dataset_name: ja
-doc_to_target: '{% if answer is not none %}{{answer[6+1]}}{% else %}{{answer_number|string}}{%
-  endif %}'
-doc_to_text: '{% if answer is not none %}{{question+"\nAnswer"}}{% else %}{{"問題: "+question+"\nAnswer"}}{%
-  endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[11:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nAnswer:"}}{% else %}{{"問題: "+question+"\nAnswer:"}}{% endif %}'
 include: direct_yaml
 task: mgsm_direct_ja
6 changes: 2 additions & 4 deletions lm_eval/tasks/mgsm/direct/mgsm_direct_ru.yaml
@@ -1,8 +1,6 @@
 # Generated by utils.py
 dataset_name: ru
-doc_to_target: '{% if answer is not none %}{{answer[6+1]}}{% else %}{{answer_number|string}}{%
-  endif %}'
-doc_to_text: '{% if answer is not none %}{{question+"\nAnswer"}}{% else %}{{"Задача:
-  "+question+"\nAnswer"}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[18:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nAnswer:"}}{% else %}{{"Задача: "+question+"\nAnswer:"}}{% endif %}'
 include: direct_yaml
 task: mgsm_direct_ru
6 changes: 2 additions & 4 deletions lm_eval/tasks/mgsm/direct/mgsm_direct_sw.yaml
@@ -1,8 +1,6 @@
 # Generated by utils.py
 dataset_name: sw
-doc_to_target: '{% if answer is not none %}{{answer[6+1]}}{% else %}{{answer_number|string}}{%
-  endif %}'
-doc_to_text: '{% if answer is not none %}{{question+"\nAnswer"}}{% else %}{{"Swali:
-  "+question+"\nAnswer"}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[25:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nAnswer:"}}{% else %}{{"Swali: "+question+"\nAnswer:"}}{% endif %}'
 include: direct_yaml
 task: mgsm_direct_sw
6 changes: 2 additions & 4 deletions lm_eval/tasks/mgsm/direct/mgsm_direct_te.yaml
@@ -1,8 +1,6 @@
 # Generated by utils.py
 dataset_name: te
-doc_to_target: '{% if answer is not none %}{{answer[6+1]}}{% else %}{{answer_number|string}}{%
-  endif %}'
-doc_to_text: '{% if answer is not none %}{{question+"\nAnswer"}}{% else %}{{"ప్రశ్న:
-  "+question+"\nAnswer"}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[19:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nAnswer:"}}{% else %}{{"ప్రశ్న: "+question+"\nAnswer:"}}{% endif %}'
 include: direct_yaml
 task: mgsm_direct_te
6 changes: 2 additions & 4 deletions lm_eval/tasks/mgsm/direct/mgsm_direct_th.yaml
@@ -1,8 +1,6 @@
 # Generated by utils.py
 dataset_name: th
-doc_to_target: '{% if answer is not none %}{{answer[6+1]}}{% else %}{{answer_number|string}}{%
-  endif %}'
-doc_to_text: '{% if answer is not none %}{{question+"\nAnswer"}}{% else %}{{"โจทย์:
-  "+question+"\nAnswer"}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[18:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nAnswer:"}}{% else %}{{"โจทย์: "+question+"\nAnswer:"}}{% endif %}'
 include: direct_yaml
 task: mgsm_direct_th
6 changes: 2 additions & 4 deletions lm_eval/tasks/mgsm/direct/mgsm_direct_zh.yaml
@@ -1,8 +1,6 @@
 # Generated by utils.py
 dataset_name: zh
-doc_to_target: '{% if answer is not none %}{{answer[6+1]}}{% else %}{{answer_number|string}}{%
-  endif %}'
-doc_to_text: '{% if answer is not none %}{{question+"\nAnswer"}}{% else %}{{"问题: "+question+"\nAnswer"}}{%
-  endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[6:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nAnswer:"}}{% else %}{{"问题: "+question+"\nAnswer:"}}{% endif %}'
 include: direct_yaml
 task: mgsm_direct_zh
1 change: 0 additions & 1 deletion lm_eval/tasks/mgsm/en_cot/cot_yaml
@@ -7,7 +7,6 @@ dataset_name: null # Overridden by language-specific config.
 output_type: generate_until
 training_split: train
 test_split: test
-target_delimiter: ""
 generation_kwargs:
   until:
     - "\n\n"
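
Why the empty target_delimiter matters once the targets are sliced can be sketched as follows. This is a simplified model, not the harness's actual code: it assumes each few-shot example is rendered as context, then delimiter, then target, and render_example is a hypothetical helper introduced only for illustration.

```python
def render_example(context: str, target: str, target_delimiter: str = " ") -> str:
    """Join a prompt context and its target (hypothetical helper, simplified model)."""
    return context + target_delimiter + target


# Illustrative question and answer text; both are assumptions.
context = "Question: What is 5 + 6?\nStep-by-Step Answer:"
target = "5 + 6 = 11. The answer is 11."

with_space = render_example(context, target)         # single-space delimiter
without_space = render_example(context, target, "")  # the setting removed in this commit

print(with_space.splitlines()[-1])
print(without_space.splitlines()[-1])
```

With the prefix and its trailing space sliced off the target, an empty delimiter would glue the answer directly onto the colon, which is consistent with the commit message's note that English CoTs should have a space as target_delimiter.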
@@ -1,6 +1,6 @@
 # Generated by utils.py
 dataset_name: bn
-doc_to_target: '{% if answer is not none %}{{answer[20+1]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[17:]}}{% else %}{{answer_number|string}}{% endif %}'
 doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"প্রশ্ন: "+question+"\nStep-by-Step Answer:"}}{% endif %}'
 include: cot_yaml
-task: mgsm_bn_direct
+task: mgsm_en_cot_bn
@@ -1,6 +1,6 @@
 # Generated by utils.py
 dataset_name: de
-doc_to_target: '{% if answer is not none %}{{answer[20+1]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[29:]}}{% else %}{{answer_number|string}}{% endif %}'
 doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"Frage: "+question+"\nStep-by-Step Answer:"}}{% endif %}'
 include: cot_yaml
-task: mgsm_de_direct
+task: mgsm_en_cot_de
@@ -1,6 +1,6 @@
 # Generated by utils.py
 dataset_name: en
-doc_to_target: '{% if answer is not none %}{{answer[20+1]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[21:]}}{% else %}{{answer_number|string}}{% endif %}'
 doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"Question: "+question+"\nStep-by-Step Answer:"}}{% endif %}'
 include: cot_yaml
-task: mgsm_en_direct
+task: mgsm_en_cot_en
@@ -1,6 +1,6 @@
 # Generated by utils.py
 dataset_name: es
-doc_to_target: '{% if answer is not none %}{{answer[20+1]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[23:]}}{% else %}{{answer_number|string}}{% endif %}'
 doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"Pregunta: "+question+"\nStep-by-Step Answer:"}}{% endif %}'
 include: cot_yaml
-task: mgsm_es_direct
+task: mgsm_en_cot_es
@@ -1,6 +1,6 @@
 # Generated by utils.py
 dataset_name: fr
-doc_to_target: '{% if answer is not none %}{{answer[20+1]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[26:]}}{% else %}{{answer_number|string}}{% endif %}'
 doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"Question : "+question+"\nStep-by-Step Answer:"}}{% endif %}'
 include: cot_yaml
-task: mgsm_fr_direct
+task: mgsm_en_cot_fr
@@ -1,6 +1,6 @@
 # Generated by utils.py
 dataset_name: ja
-doc_to_target: '{% if answer is not none %}{{answer[20+1]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[11:]}}{% else %}{{answer_number|string}}{% endif %}'
 doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"問題: "+question+"\nStep-by-Step Answer:"}}{% endif %}'
 include: cot_yaml
-task: mgsm_ja_direct
+task: mgsm_en_cot_ja
@@ -1,6 +1,6 @@
 # Generated by utils.py
 dataset_name: ru
-doc_to_target: '{% if answer is not none %}{{answer[20+1]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[18:]}}{% else %}{{answer_number|string}}{% endif %}'
 doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"Задача: "+question+"\nStep-by-Step Answer:"}}{% endif %}'
 include: cot_yaml
-task: mgsm_ru_direct
+task: mgsm_en_cot_ru
@@ -1,6 +1,6 @@
 # Generated by utils.py
 dataset_name: sw
-doc_to_target: '{% if answer is not none %}{{answer[20+1]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[25:]}}{% else %}{{answer_number|string}}{% endif %}'
 doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"Swali: "+question+"\nStep-by-Step Answer:"}}{% endif %}'
 include: cot_yaml
-task: mgsm_sw_direct
+task: mgsm_en_cot_sw
@@ -1,6 +1,6 @@
 # Generated by utils.py
 dataset_name: te
-doc_to_target: '{% if answer is not none %}{{answer[20+1]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[19:]}}{% else %}{{answer_number|string}}{% endif %}'
 doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"ప్రశ్న: "+question+"\nStep-by-Step Answer:"}}{% endif %}'
 include: cot_yaml
-task: mgsm_te_direct
+task: mgsm_en_cot_te
@@ -1,6 +1,6 @@
 # Generated by utils.py
 dataset_name: th
-doc_to_target: '{% if answer is not none %}{{answer[20+1]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[18:]}}{% else %}{{answer_number|string}}{% endif %}'
 doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"โจทย์: "+question+"\nStep-by-Step Answer:"}}{% endif %}'
 include: cot_yaml
-task: mgsm_th_direct
+task: mgsm_en_cot_th
@@ -1,6 +1,6 @@
 # Generated by utils.py
 dataset_name: zh
-doc_to_target: '{% if answer is not none %}{{answer[20+1]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_target: '{% if answer is not none %}{{answer[6:]}}{% else %}{{answer_number|string}}{% endif %}'
 doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"问题: "+question+"\nStep-by-Step Answer:"}}{% endif %}'
 include: cot_yaml
-task: mgsm_zh_direct
+task: mgsm_en_cot_zh
5 changes: 5 additions & 0 deletions lm_eval/tasks/mgsm/gen_yaml.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+
+python utils.py --overwrite --output-dir direct --mode direct
+python utils.py --overwrite --output-dir en_cot --mode en-cot
+python utils.py --overwrite --output-dir native_cot --mode native-cot
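
The naming scheme from the commit message, mgsm_(mode)_(lang), can be sketched in a few lines of Python. This is not the code in utils.py: task_name and yaml_path are hypothetical helpers, and the only inputs taken from this commit are the three --mode values, the output directories, and the eleven language codes that appear in the diffs.

```python
# Language codes that appear in the regenerated YAML files.
LANGS = ["bn", "de", "en", "es", "fr", "ja", "ru", "sw", "te", "th", "zh"]
# Values passed to --mode in gen_yaml.sh.
MODES = ["direct", "en-cot", "native-cot"]


def task_name(mode: str, lang: str) -> str:
    """Hypothetical helper: build mgsm_(mode)_(lang) from a --mode value."""
    return f"mgsm_{mode.replace('-', '_')}_{lang}"


def yaml_path(mode: str, lang: str) -> str:
    """Hypothetical helper: the directory follows the mode, the file name follows the task name."""
    return f"{mode.replace('-', '_')}/{task_name(mode, lang)}.yaml"


for mode in MODES:
    print(yaml_path(mode, "bn"))
```

Deriving the file name from the task name in one place is what keeps the two from drifting apart, which is the inconsistency the old mgsm_cot_native_*.yaml and mgsm_*_direct names showed.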
2 changes: 1 addition & 1 deletion lm_eval/tasks/mgsm/native_cot/cot_yaml
@@ -7,7 +7,7 @@ dataset_name: null # Overridden by language-specific config.
 output_type: generate_until
 training_split: train
 test_split: test
-target_delimiter: ""
+# target_delimiter: ""
 generation_kwargs:
   until:
     - "\n\n"
8 changes: 0 additions & 8 deletions lm_eval/tasks/mgsm/native_cot/mgsm_cot_native_bn.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions lm_eval/tasks/mgsm/native_cot/mgsm_cot_native_de.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions lm_eval/tasks/mgsm/native_cot/mgsm_cot_native_en.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions lm_eval/tasks/mgsm/native_cot/mgsm_cot_native_es.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions lm_eval/tasks/mgsm/native_cot/mgsm_cot_native_fr.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions lm_eval/tasks/mgsm/native_cot/mgsm_cot_native_ja.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions lm_eval/tasks/mgsm/native_cot/mgsm_cot_native_ru.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions lm_eval/tasks/mgsm/native_cot/mgsm_cot_native_sw.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions lm_eval/tasks/mgsm/native_cot/mgsm_cot_native_te.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions lm_eval/tasks/mgsm/native_cot/mgsm_cot_native_th.yaml

This file was deleted.

8 changes: 0 additions & 8 deletions lm_eval/tasks/mgsm/native_cot/mgsm_cot_native_zh.yaml

This file was deleted.

12 changes: 12 additions & 0 deletions lm_eval/tasks/mgsm/native_cot/mgsm_native_cot_bn.yaml
@@ -0,0 +1,12 @@
+# Generated by utils.py
+dataset_name: bn
+doc_to_target: '{% if answer is not none %}{{answer[17:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nধাপে ধাপে উত্তর:"}}{% else %}{{"প্রশ্ন: "+question+"\nধাপে ধাপে উত্তর:"}}{% endif %}'
+filter_list:
+- filter:
+  - function: regex
+    regex_pattern: The answer is (\-?[0-9\.\,]+)
+  - function: take_first
+  name: get-answer
+include: cot_yaml
+task: mgsm_native_cot_bn
12 changes: 12 additions & 0 deletions lm_eval/tasks/mgsm/native_cot/mgsm_native_cot_de.yaml
@@ -0,0 +1,12 @@
+# Generated by utils.py
+dataset_name: de
+doc_to_target: '{% if answer is not none %}{{answer[29:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nSchritt-für-Schritt-Antwort:"}}{% else %}{{"Frage: "+question+"\nSchritt-für-Schritt-Antwort:"}}{% endif %}'
+filter_list:
+- filter:
+  - function: regex
+    regex_pattern: Die Antwort lautet (\-?[0-9\.\,]+)
+  - function: take_first
+  name: get-answer
+include: cot_yaml
+task: mgsm_native_cot_de
12 changes: 12 additions & 0 deletions lm_eval/tasks/mgsm/native_cot/mgsm_native_cot_en.yaml
@@ -0,0 +1,12 @@
+# Generated by utils.py
+dataset_name: en
+doc_to_target: '{% if answer is not none %}{{answer[21:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nStep-by-Step Answer:"}}{% else %}{{"Question: "+question+"\nStep-by-Step Answer:"}}{% endif %}'
+filter_list:
+- filter:
+  - function: regex
+    regex_pattern: The answer is (\-?[0-9\.\,]+)
+  - function: take_first
+  name: get-answer
+include: cot_yaml
+task: mgsm_native_cot_en
12 changes: 12 additions & 0 deletions lm_eval/tasks/mgsm/native_cot/mgsm_native_cot_es.yaml
@@ -0,0 +1,12 @@
+# Generated by utils.py
+dataset_name: es
+doc_to_target: '{% if answer is not none %}{{answer[23:]}}{% else %}{{answer_number|string}}{% endif %}'
+doc_to_text: '{% if answer is not none %}{{question+"\nRespuesta paso a paso:"}}{% else %}{{"Pregunta: "+question+"\nRespuesta paso a paso:"}}{% endif %}'
+filter_list:
+- filter:
+  - function: regex
+    regex_pattern: La respuesta es (\-?[0-9\.\,]+)
+  - function: take_first
+  name: get-answer
+include: cot_yaml
+task: mgsm_native_cot_es
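
What the get-answer filter in the native_cot files is meant to extract can be approximated with Python's re module. The sketch below is an assumption about the intended behavior (apply the regex, then take the first match), not the harness's filter implementation; get_answer is a hypothetical helper and the sample generations are made up.

```python
import re
from typing import Optional

# Patterns copied from the regex_pattern fields in the native_cot YAML files.
PATTERNS = {
    "en": r"The answer is (\-?[0-9\.\,]+)",
    "de": r"Die Antwort lautet (\-?[0-9\.\,]+)",
    "es": r"La respuesta es (\-?[0-9\.\,]+)",
}


def get_answer(lang: str, generation: str) -> Optional[str]:
    """Return the first captured number, or None when the pattern does not match."""
    matches = re.findall(PATTERNS[lang], generation)
    return matches[0] if matches else None


print(get_answer("en", "5 + 6 = 11. The answer is 11"))
print(get_answer("es", "La respuesta es -3,5 grados"))
print(get_answer("en", "The answer is 11."))
print(get_answer("de", "Keine Zahl hier"))
```

Because the character class includes "." and ",", a sentence-final period directly after the number is captured as part of the match, so any comparison against the target has to account for that.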