Commit

Adding some notes for resuming converted FP8 checkpoints
Signed-off-by: Ming Huang <[email protected]>
mingxu1067 committed May 23, 2024
1 parent ebc9745 commit c46cded
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions rosetta/utils/te_pax_t5x_ckpt_converter/README.md
@@ -184,6 +184,10 @@ python converter/main.py \
--mlp-intermediate-dim=1024
```
NOTE:
In the generated FP8 meta, only the amax of the weights is accurate. When resuming training from a converted FP8 checkpoint,
expect the first few steps to be spent adjusting the FP8 meta of inputs and gradients.
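The note above can be illustrated with a minimal sketch of delayed scaling, where a scaling factor is derived from a sliding window of recent amax values. This is not the converter's or Transformer Engine's actual code: the function names, the window length of 4, the zero-filled starting history, and the per-step amax values are all assumptions made for illustration. Only the E4M3 maximum of 448 is a property of the format.

```python
# Hypothetical sketch of delayed scaling; names and values are illustrative only.
E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


def update_amax_history(history, new_amax):
    """Drop the oldest amax and append the newest one (fixed-length window)."""
    return history[1:] + [new_amax]


def scale_from_history(history, fp8_max=E4M3_MAX):
    """Derive a scaling factor from the largest amax in the window."""
    amax = max(history)
    # With no usable statistics yet, fall back to a neutral scale of 1.0.
    return fp8_max / amax if amax > 0.0 else 1.0


# Assumption: after conversion, the inputs' amax history holds placeholders.
input_history = [0.0, 0.0, 0.0, 0.0]
observed = [3.5, 4.0, 3.0, 3.75]  # made-up per-step amax values of the inputs

scales = []
for step_amax in observed:
    # The scale used at each step comes from the history of earlier steps.
    scales.append(scale_from_history(input_history))
    input_history = update_amax_history(input_history, step_amax)

final_scale = scale_from_history(input_history)
print(scales, final_scale)
```

In this sketch the first step runs with the neutral fallback scale, and the scale only settles once real amax values have filled the window, which is the kind of warm-up the note describes for inputs and gradients.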
#### The folder structure of CKPT by Pax and T5X
If you would like to run the converted CKPTs with the frameworks, the converted CKPTs are expected to have the same folder
structure as CKPTs stored by the frameworks. In this case, you could set `--output-path` to be the same structure as the