Flores 101 prompts #828

Open
wants to merge 44 commits into
base: eval-hackathon
44 commits
da5e046
added first templates for DiaBLa dataset rbawden/DiaBLa
Apr 28, 2022
77ba6e9
declared previous_ref at beginning of templates
Apr 28, 2022
fa96aa0
declared more variables at beginning of templates
Apr 28, 2022
3a19cfb
declared more variables at beginning of templates
Apr 28, 2022
44c7551
declared more variables at beginning of templates
Apr 29, 2022
9379706
declared more variables at beginning of templates
Apr 29, 2022
f5fc2a9
corrected ref for mt in one template
Apr 29, 2022
ec3d5e8
corrected ref for mt in one template
Apr 29, 2022
64787de
Merge branch 'eval-hackathon' into diabla_prompts
rbawden Apr 29, 2022
ffee147
moved condition to just target side rather than around entire prompt
Apr 29, 2022
3434d9d
Merge branch 'diabla_prompts' of https://github.com/rbawden/promptsou…
Apr 29, 2022
7659b3d
allow template even when no context (beginning of dialogue)
Apr 29, 2022
88127d4
corrected templates that use past history
Apr 29, 2022
bf51583
corrected multiple choice answer field
May 9, 2022
5aa9190
updated templates - simplified some targets and added translation com…
May 11, 2022
7d1fa37
updated templates - simplified some targets and added translation com…
May 11, 2022
2728caf
corrected duplicate name
May 11, 2022
d358fe1
updated duplicate definition
May 11, 2022
da98d59
corrected error of only two pipes in answer choices
May 12, 2022
54abd8a
corrected -2 index to -1 - duplicate definitions
May 12, 2022
5b0ef0e
Merge branch 'eval-hackathon'
May 12, 2022
84914a3
pulled new branch eval-hackathon & added diabla templates
May 12, 2022
833801e
Merge branch 'eval-hackathon' into diabla_prompts
May 12, 2022
54b8082
Merge branch 'eval-hackathon' into diabla_prompts
rbawden May 13, 2022
f240020
merged with eval-hackathon updates
May 13, 2022
de6cb66
merge
May 16, 2022
26419f9
corrected discriminate mt ref
May 16, 2022
013bb63
removed directional templates and only keep both directions (analysis…
May 17, 2022
7b73d78
simplified names, changed random to choice
May 17, 2022
63aae1a
merged
May 22, 2022
ce61f76
added automatically generated prompts for flores-101 (all subset) - a…
May 22, 2022
aac04e6
merged with diabla_prompts branch
May 22, 2022
387c4a5
Merge branch 'eval-hackathon' into flores-101-prompts
rbawden May 22, 2022
74ce886
merge templates (add subfolder)
Jun 16, 2022
b6fa2e7
changed choice of template to one of the form: SRC: src_text = TRG: |…
Jun 16, 2022
94e01c7
Merge branch 'eval-hackathon' into flores-101-prompts
VictorSanh Jun 25, 2022
23da32f
style and format
VictorSanh Jun 25, 2022
9adcf6c
update with new prompt
Jun 29, 2022
bd1f5c2
Merge branch 'flores-101-prompts' of https://github.com/rbawden/promp…
Jun 29, 2022
3eee291
update flores prompts to xglm
Sep 27, 2022
58d934b
regenerated prompts
Sep 27, 2022
fa47d41
regenerated prompts
Sep 27, 2022
0ab76cc
regenerated prompts
Sep 27, 2022
9b35a8a
regenerated prompts
Sep 27, 2022
30 changes: 23 additions & 7 deletions promptsource/templates.py
@@ -27,7 +27,17 @@
 # These are users whose datasets should be included in the results returned by
 # filter_english_datasets (regardless of their metadata)

-INCLUDED_USERS = {"Zaid", "craffel", "GEM", "aps", "khalidalt", "shanya", "rbawden", "BigScienceBiasEval", "gsarti"}
+INCLUDED_USERS = {
+    "Zaid",
+    "craffel",
+    "GEM",
+    "aps",
+    "khalidalt",
+    "shanya",
+    "rbawden",
+    "BigScienceBiasEval",
+    "gsarti",
+}

 # These are the metrics with which templates can be tagged
 METRICS = {
@@ -79,7 +89,7 @@
     "ce": "Chechen",
     "ny": "Chichewa, Chewa, Nyanja",
     "zh": "Chinese",
-    "cu": "Church Slavic, Old Slavonic, Church Slavonic, Old Bulgarian, Old Church Slavonic",
+    "cu": "Church Slavic, Old Slavonic, Church Slavonic, Old Bulgarian, Old Church Slavonic",
     "cv": "Chuvash",
     "kw": "Cornish",
     "co": "Corsican",
@@ -120,7 +130,7 @@
     "io": "Ido",
     "ig": "Igbo",
     "id": "Indonesian",
-    "ia": "Interlingua (International Auxiliary Language Association)",
+    "ia": "Interlingua (International Auxiliary Language Association)",
     "ie": "Interlingue, Occidental",
     "iu": "Inuktitut",
     "ik": "Inupiaq",
@@ -181,7 +191,7 @@
     "pt": "Portuguese",
     "pa": "Punjabi, Panjabi",
     "qu": "Quechua",
-    "ro": "Romanian, Moldavian, Moldovan",
+    "ro": "Romanian, Moldavian, Moldovan",
     "rm": "Romansh",
     "rn": "Rundi",
     "ru": "Russian",
@@ -212,7 +222,7 @@
     "th": "Thai",
     "bo": "Tibetan",
     "ti": "Tigrinya",
-    "to": "Tonga (Tonga Islands)",
+    "to": "Tonga (Tonga Islands)",
     "ts": "Tsonga",
     "tn": "Tswana",
     "tr": "Turkish",
@@ -493,7 +503,10 @@ def _collect_datasets(self) -> Dict[Tuple[str, str], "DatasetTemplates"]:
         for dataset in dataset_folders:
             if dataset in INCLUDED_USERS:
                 for filename in os.listdir(os.path.join(TEMPLATES_FOLDER_PATH, dataset)):
-                    output = {**output, **self._collect_dataset(dataset + "/" + filename)}
+                    output = {
+                        **output,
+                        **self._collect_dataset(dataset + "/" + filename),
+                    }
             else:
                 output = {**output, **self._collect_dataset(dataset)}
         return output
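
For reviewers unfamiliar with this loop, the traversal in `_collect_datasets` can be sketched in isolation. This is a minimal illustration, not the promptsource implementation: `collect_dataset_names`, `demo`, the two-user set, and the folder names are all hypothetical, and the sketch returns plain names instead of calling `_collect_dataset`. It only shows that a folder named after an included user is treated as a namespace whose children are reported as "user/dataset".

```python
import os
import tempfile

# Hypothetical stand-in for INCLUDED_USERS; the real set is defined in templates.py.
EXAMPLE_INCLUDED_USERS = {"rbawden", "gsarti"}


def collect_dataset_names(templates_root, included_users=EXAMPLE_INCLUDED_USERS):
    """Return dataset names found directly under templates_root.

    A folder named after an included user is a namespace, so each of its
    children is reported as "user/dataset". Any other folder is reported as-is.
    """
    names = []
    for dataset in sorted(os.listdir(templates_root)):
        if dataset in included_users:
            user_folder = os.path.join(templates_root, dataset)
            for filename in sorted(os.listdir(user_folder)):
                names.append(dataset + "/" + filename)
        else:
            names.append(dataset)
    return names


def demo():
    """Build a throwaway folder tree and list the dataset names in it."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "rbawden", "DiaBLa"))  # user-namespaced dataset
        os.makedirs(os.path.join(root, "squad"))  # illustrative top-level dataset
        return collect_dataset_names(root)
```

Calling `demo()` lists the namespaced dataset under its user prefix and the top-level dataset unchanged. The reformatted merge in the hunk above builds the same mapping as before; only the line wrapping differs.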
@@ -587,7 +600,10 @@ def format_for_dump(self) -> Dict:
         """
         Create a formatted dictionary for the class attributes
         """
-        formatted_dict = {self.DATASET_KEY: self.dataset_name, self.TEMPLATES_KEY: self.templates}
+        formatted_dict = {
+            self.DATASET_KEY: self.dataset_name,
+            self.TEMPLATES_KEY: self.templates,
+        }
         if self.subset_name:
             formatted_dict[self.SUBSET_KEY] = self.subset_name
         return formatted_dict
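
The behaviour of `format_for_dump` is unchanged by this hunk, which can be checked with a small stand-alone sketch. The class name `DatasetTemplatesSketch`, its constructor, and the string values of the three key constants are assumptions made for illustration; only the method body mirrors the diff.

```python
class DatasetTemplatesSketch:
    """Hypothetical, reduced stand-in for the class that owns format_for_dump."""

    # Assumed values; the real constants live in promptsource/templates.py.
    DATASET_KEY = "dataset"
    SUBSET_KEY = "subset"
    TEMPLATES_KEY = "templates"

    def __init__(self, dataset_name, subset_name=None, templates=None):
        self.dataset_name = dataset_name
        self.subset_name = subset_name
        self.templates = templates if templates is not None else {}

    def format_for_dump(self):
        """Create a formatted dictionary for the class attributes."""
        formatted_dict = {
            self.DATASET_KEY: self.dataset_name,
            self.TEMPLATES_KEY: self.templates,
        }
        # The subset key is only written when a subset name is set.
        if self.subset_name:
            formatted_dict[self.SUBSET_KEY] = self.subset_name
        return formatted_dict
```

With no subset, the dumped dictionary has two keys; with a subset name it gains a third, so files for datasets without subsets stay free of an empty subset entry.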