Actions: EleutherAI/lm-evaluation-harness

Tasks Modified

1,672 workflow run results

Fix AttributeError in huggingface.py When 'model_type' is Missing (#1…
Tasks Modified #1860: Commit cc771ec pushed by haileyschoelkopf
February 27, 2024 23:31 16s main
Fix AttributeError in huggingface.py When 'model_type' is Missing
Tasks Modified #1859: Pull request #1489 synchronize by haileyschoelkopf
February 27, 2024 21:54 17s richwardle:main
Fix AttributeError in huggingface.py When 'model_type' is Missing
Tasks Modified #1858: Pull request #1489 synchronize by haileyschoelkopf
February 27, 2024 21:53 13s richwardle:main
update name of val split in truthfulqa multilingual (#1488)
Tasks Modified #1856: Commit a08eb87 pushed by haileyschoelkopf
February 27, 2024 19:36 3m 9s main
Update TruthfulQA val split name
Tasks Modified #1854: Pull request #1488 opened by haileyschoelkopf
February 27, 2024 19:04 3m 39s fix-multilingual-tqa-splits
add multilingual mmlu eval (#1484)
Tasks Modified #1852: Commit 7cd004c pushed by haileyschoelkopf
February 27, 2024 15:42 2m 23s main
Refactor evaluater.evaluate (#1441)
Tasks Modified #1851: Commit 5ccd65d pushed by haileyschoelkopf
February 27, 2024 14:04 1m 37s main
add multilingual mmlu eval
Tasks Modified #1850: Pull request #1484 opened by jordane95
February 27, 2024 13:44 2m 8s jordane95:main
Add a new task GPQA (the part CoT and generative)
Tasks Modified #1849: Pull request #1482 synchronize by uanu2002
February 27, 2024 06:26 1m 47s uanu2002:gpqa_cot
Add a new task GPQA (the part CoT and generative)
Tasks Modified #1848: Pull request #1482 synchronize by uanu2002
February 27, 2024 06:23 1m 32s uanu2002:gpqa_cot
Add a new task GPQA (the part CoT and generative)
Tasks Modified #1847: Pull request #1482 synchronize by uanu2002
February 27, 2024 06:17 1m 46s uanu2002:gpqa_cot
Add a new task GPQA (the part CoT and generative)
Tasks Modified #1846: Pull request #1482 opened by uanu2002
February 27, 2024 03:19 1m 32s uanu2002:gpqa_cot
Transfer zero-shot BBH parsing improvements to few-shot BBH
Tasks Modified #1845: Pull request #1481 opened by haileyschoelkopf
February 26, 2024 20:58 4m 40s fix-fewshot-bbh-parsing
Always include EOS token as stop sequence
Tasks Modified #1844: Pull request #1480 opened by haileyschoelkopf
February 26, 2024 20:41 13s use-eos-default
Refactor evaluater.evaluate
Tasks Modified #1843: Pull request #1441 synchronize by baberabb
February 26, 2024 16:55 1m 51s baberabb:eval
Refactor evaluater.evaluate
Tasks Modified #1842: Pull request #1441 synchronize by baberabb
February 26, 2024 16:46 1m 35s baberabb:eval
Cont metrics (#1475)
Tasks Modified #1841: Commit 96d185f pushed by haileyschoelkopf
February 26, 2024 16:12 2m 28s main
Cont metrics
Tasks Modified #1840: Pull request #1475 synchronize by haileyschoelkopf
February 26, 2024 16:05 2m 41s cont-metrics
Cont metrics
Tasks Modified #1839: Pull request #1475 synchronize by haileyschoelkopf
February 26, 2024 16:04 1m 50s cont-metrics
Cont metrics
Tasks Modified #1838: Pull request #1475 synchronize by haileyschoelkopf
February 26, 2024 16:02 2m 26s cont-metrics
Cont metrics
Tasks Modified #1836: Pull request #1475 synchronize by lintangsutawika
February 26, 2024 15:45 2m 8s cont-metrics
Cont metrics
Tasks Modified #1835: Pull request #1475 synchronize by lintangsutawika
February 26, 2024 15:45 1m 51s cont-metrics