t5_11

Housing our model example of fine-tuning an 11B T5 with FSDP to create a world-class grammar checker.

To get going:

pip install -r requirements.txt

A small and a large dataset are already included in the project (grammar_train.csv = small, gtrain_150K.csv = large).
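
For a quick look at the data, the snippet below simply loads the small CSV and prints a few rows (presumably ungrammatical-to-corrected sentence pairs; the column names are whatever the CSV header defines, not assumed here):

import pandas as pd

# Peek at the small dataset: shape, column names, and a few example rows.
df = pd.read_csv("grammar_train.csv")
print(df.shape)
print(df.head())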

To baseline your environment with this model (adjust --nproc_per_node to match your GPU count):

torchrun --nnodes=1 --nproc_per_node=8 --rdzv_id=101 --rdzv_endpoint="localhost:5679" main_benchmark.py  

On an A100 (p4d.24xlarge) you should expect to see:

(benchmark_t5 results image)

To train with mp.spawn:

python main.py
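
Under the hood, the mp.spawn path launches one worker process per GPU from a single Python entry point. A minimal sketch of that pattern (illustrative only, not the repo's actual main.py):

import os
import torch
import torch.distributed as dist
import torch.multiprocessing as mp

def worker(rank, world_size):
    # Each spawned process joins the process group and claims its own GPU.
    os.environ.setdefault("MASTER_ADDR", "localhost")
    os.environ.setdefault("MASTER_PORT", "5679")
    dist.init_process_group("nccl", rank=rank, world_size=world_size)
    torch.cuda.set_device(rank)
    # ... build the model, wrap it with FSDP, run the training loop ...
    dist.destroy_process_group()

if __name__ == "__main__":
    world_size = torch.cuda.device_count()
    mp.spawn(worker, args=(world_size,), nprocs=world_size, join=True)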

Or better, with torchrun:

torchrun --nnodes=1 --nproc_per_node=8 --rdzv_id=101 --rdzv_endpoint="localhost:5679" main_elastic.py  
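
The core of the FSDP setup is sharding the 11B parameters across GPUs with a transformer auto-wrap policy. A minimal sketch of wrapping T5 this way (this follows the standard PyTorch FSDP/T5 recipe, not necessarily the repo's exact code, and omits CPU offload and mixed-precision options):

import functools
import torch
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy
from transformers import T5ForConditionalGeneration
from transformers.models.t5.modeling_t5 import T5Block

# Wrap each T5Block as its own FSDP unit, so the full 11B parameters are
# never gathered on a single GPU at once -- only one block's weights at a time.
wrap_policy = functools.partial(
    transformer_auto_wrap_policy,
    transformer_layer_cls={T5Block},
)

model = T5ForConditionalGeneration.from_pretrained("t5-11b")
model = FSDP(
    model,
    auto_wrap_policy=wrap_policy,
    device_id=torch.cuda.current_device(),
)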

You can control the model size, dataset size, batch size, and other settings in config/defaults.py.
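
As a rough guide, the kinds of settings that live there look like the sketch below; every field name here is a hypothetical placeholder, so check config/defaults.py for the actual options and defaults.

from dataclasses import dataclass

# Hypothetical config shape -- field names are placeholders for illustration,
# not the actual contents of config/defaults.py.
@dataclass
class TrainConfig:
    model_name: str = "t5-11b"               # swap in a smaller T5 for debugging
    dataset_file: str = "grammar_train.csv"  # or gtrain_150K.csv for the large set
    batch_size: int = 4
    num_epochs: int = 2
    lr: float = 3e-4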
