Update table with new benchmark results #1361

awaelchli · 2024-04-26T01:03:01Z

Adds a new column with automated benchmark results to config_hub/finetune/README.md, as a follow-up to #1337. The new column, "Multitask score", currently covers only MMLU. More categories will be added in the future.

The following settings were used to run MMLU:

litgpt evaluate \
  --checkpoint_dir ... \
  --batch_size 4 \
  --device cuda \
  --dtype bfloat16 \
  --tasks mmlu \
  ...
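
For anyone re-running the benchmark across several checkpoints, the settings above can be assembled programmatically. The following is a minimal sketch, not part of this PR: `MMLU_SETTINGS` and `build_evaluate_command` are hypothetical names, the checkpoint path is a placeholder, and any flags elided by the "..." above are deliberately left out.

```python
import shlex

# Flags exactly as listed in the PR description. Anything elided there
# ("...") is intentionally not reproduced here.
MMLU_SETTINGS = {
    "batch_size": 4,
    "device": "cuda",
    "dtype": "bfloat16",
    "tasks": "mmlu",
}


def build_evaluate_command(checkpoint_dir, settings=MMLU_SETTINGS):
    """Return the argv list for `litgpt evaluate` (hypothetical helper)."""
    argv = ["litgpt", "evaluate", "--checkpoint_dir", str(checkpoint_dir)]
    for flag, value in settings.items():
        argv += [f"--{flag}", str(value)]
    return argv


if __name__ == "__main__":
    # "path/to/checkpoint" is a placeholder, not a path from the PR.
    print(shlex.join(build_evaluate_command("path/to/checkpoint")))
```

The helper only builds the argument list. Passing that list to `subprocess.run` would launch the evaluation, assuming `litgpt` is installed and the checkpoint directory exists.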

Also removes the "Dataset" and "Precision" columns to make space, since their values are the same in every row.

config_hub/finetune/README.md

Co-authored-by: Sebastian Raschka <[email protected]>

awaelchli added 3 commits April 25, 2024 17:21

update description

10807f1

update table

2a39d54

update

feca72a

awaelchli requested review from carmocca and lantiga as code owners April 26, 2024 01:03

awaelchli requested a review from rasbt April 26, 2024 01:03

carmocca approved these changes Apr 26, 2024

rasbt reviewed Apr 26, 2024

config_hub/finetune/README.md (outdated, resolved)

rasbt approved these changes Apr 26, 2024

config_hub/finetune/README.md (outdated, resolved)

awaelchli and others added 3 commits April 26, 2024 07:48

Update config_hub/finetune/README.md

5f10f42

Co-authored-by: Sebastian Raschka <[email protected]>

Update config_hub/finetune/README.md

4d51b73

Co-authored-by: Sebastian Raschka <[email protected]>

add reference

967cfda

awaelchli merged commit 5895df1 into main Apr 26, 2024
9 checks passed

awaelchli deleted the docs/finetune-bench-numbers-2 branch April 26, 2024 16:10
