forked from lessw2020/t5_11

housing our model example of fine tuning an 11B t5 with FSDP

lchu-ibm/t5_11

 
 


t5_11

housing our model example of fine-tuning an 11B T5 with FSDP to create a world-class grammar checker.

[image: children_correction]

to get going...

pip install -r requirements.txt

both a small and a large dataset are already included in the project (grammar_train.csv = small, gtrain_150K.csv = large).

to baseline your environment or this model (adjust --nproc_per_node to match your GPU count):

torchrun --nnodes=1 --nproc_per_node=8 --rdzv_id=101 --rdzv_endpoint="localhost:5679" main_benchmark.py  

On an A100 (p4d.24xlarge) you should expect to see:

[image: benchmark_t5]
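Since the torchrun launch line above has to be edited by hand for each machine, it can help to assemble it from the detected GPU count. The following is a minimal sketch, not part of this repo: `detect_gpu_count` and `build_torchrun_cmd` are hypothetical helper names, and the flags and defaults simply mirror the command shown above.

```python
import shlex


def detect_gpu_count() -> int:
    """Return the number of visible CUDA devices, or 0 if torch is not installed."""
    try:
        import torch  # imported lazily so this helper also loads without torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


def build_torchrun_cmd(
    script: str,
    nproc_per_node: int,
    port: int = 5679,
    rdzv_id: int = 101,
) -> list[str]:
    """Assemble a single-node torchrun command (hypothetical helper)."""
    if nproc_per_node < 1:
        raise ValueError("nproc_per_node must be at least 1")
    return [
        "torchrun",
        "--nnodes=1",
        f"--nproc_per_node={nproc_per_node}",
        f"--rdzv_id={rdzv_id}",
        f"--rdzv_endpoint=localhost:{port}",
        script,
    ]


if __name__ == "__main__":
    # Fall back to one process when no GPU (or no torch) is found.
    gpus = detect_gpu_count() or 1
    print(shlex.join(build_torchrun_cmd("main_benchmark.py", gpus)))
```

Running the script prints a command you can paste into a shell; passing `"main_elastic.py"` instead gives the matching training command.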

To train with mp spawn:

python main.py

Or better, with torchrun:

torchrun --nnodes=1 --nproc_per_node=8 --rdzv_id=101 --rdzv_endpoint="localhost:5679" main_elastic.py  
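The two launch styles above differ mainly in how each worker learns its rank. With mp spawn, the parent process passes the process index as the first argument of the worker function; with torchrun, each worker reads `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` from its environment. The sketch below illustrates that difference only. `ProcessInfo` and `resolve_process_info` are hypothetical names, and how `main.py` and `main_elastic.py` actually handle this is not shown here.

```python
import os
from typing import Mapping, NamedTuple, Optional


class ProcessInfo(NamedTuple):
    rank: int
    local_rank: int
    world_size: int


def resolve_process_info(
    spawn_rank: Optional[int] = None,
    spawn_world_size: Optional[int] = None,
    env: Optional[Mapping[str, str]] = None,
) -> ProcessInfo:
    """Work out rank and world size for either launch style (hypothetical helper)."""
    env = os.environ if env is None else env

    # mp spawn style: the rank arrives as a function argument.
    if spawn_rank is not None:
        if spawn_world_size is None:
            raise ValueError("spawn_world_size is required with spawn_rank")
        # Assumes a single node, so local rank equals global rank.
        return ProcessInfo(spawn_rank, spawn_rank, spawn_world_size)

    # torchrun style: the launcher exports these variables for each worker.
    try:
        return ProcessInfo(
            int(env["RANK"]),
            int(env["LOCAL_RANK"]),
            int(env["WORLD_SIZE"]),
        )
    except KeyError as missing:
        raise RuntimeError(
            f"{missing.args[0]} is not set; launch with torchrun or pass spawn_rank"
        ) from None
```

A worker entry point could call `resolve_process_info()` under torchrun, or `resolve_process_info(rank, world_size)` when started via spawn, and then use the result to pick its GPU.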

You can control the model size, dataset size, batch size, etc., all in config/defaults.py.
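To give a feel for what a single-file config of this kind can look like, here is a minimal sketch using a dataclass. The class name, every field name, and every default value are hypothetical placeholders; only the two dataset filenames come from this README, so check config/defaults.py for the real settings.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainConfig:
    # All fields and defaults are illustrative assumptions,
    # not the actual contents of config/defaults.py.
    model_name: str = "t5-small"
    dataset_train: str = "grammar_train.csv"  # the small dataset
    batch_size: int = 4
    num_epochs: int = 1


# Start from the defaults and override only what changes for a larger run.
base = TrainConfig()
large_run = replace(base, model_name="t5-11b", dataset_train="gtrain_150K.csv")
```

Keeping the config frozen and deriving variants with `replace` makes it harder to mutate settings by accident halfway through a run.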
