YAMLS for MPT runs inherit global max_seq_len in model config (#409)
* mpt configs inherit global max_seq_len in YAML

* update hf_eval yaml with max_seq_len override

---------

Co-authored-by: Vitaliy Chiley <[email protected]>
alextrott16 and vchiley committed Jul 1, 2023
1 parent 37bf6f5 commit 5c14661
Showing 4 changed files with 5 additions and 0 deletions.
1 change: 1 addition & 0 deletions TUTORIAL.md
@@ -217,6 +217,7 @@ Now that we have our data ready, we can slightly modify `scripts/train/yamls/finetune/mpt-7b_domain_adapt.yaml`
```bash
composer scripts/train/train.py scripts/train/yamls/finetune/mpt-7b_domain_adapt.yaml max_seq_len=4096 ...
```
+> Note that this override, where we set `max_seq_len=4096` in the above command, works because of how the whole YAML is set up. Importantly, the YAML is configured with `model.config_overrides.max_seq_len: ${max_seq_len}`, which tells the MPT model to override its default max sequence length with the value set for `max_seq_len`.

You will see some info logs including your configs, and then training will start.

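The note added to TUTORIAL.md above relies on YAML variable interpolation. A minimal sketch of the pattern, assuming the layout used by the finetune YAMLs changed below (the `2048` default is illustrative):

```yaml
max_seq_len: 2048  # top-level variable; a CLI override like max_seq_len=4096 replaces it

model:
  pretrained_model_name_or_path: mosaicml/mpt-7b
  pretrained: true
  config_overrides:
    # ${max_seq_len} interpolates the top-level value, so a CLI override
    # propagates into the MPT model config as well
    max_seq_len: ${max_seq_len}
```

Without the `config_overrides` entry, a command-line override of `max_seq_len` would change the rest of the run config but leave the model's default max sequence length untouched; pointing both at the same variable keeps them in sync.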
2 changes: 2 additions & 0 deletions scripts/eval/yamls/hf_eval.yaml
@@ -26,6 +26,8 @@ models:
#     pretrained_model_name_or_path: mosaicml/mpt-7b
#     init_device: cpu
#     pretrained: true
+#     config_overrides:
+#       max_seq_len: ${max_seq_len}
#   tokenizer:
#     name: mosaicml/mpt-7b
#     kwargs:
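For reference, uncommenting that example entry would give roughly the following (a sketch reconstructed only from the commented lines visible in the diff above; the list nesting is an assumption, and the `kwargs` block is truncated at the hunk boundary):

```yaml
models:
- model:
    pretrained_model_name_or_path: mosaicml/mpt-7b
    init_device: cpu
    pretrained: true
    config_overrides:
      max_seq_len: ${max_seq_len}  # the override added by this commit
  tokenizer:
    name: mosaicml/mpt-7b
    kwargs:
      # (further tokenizer kwargs, truncated in the diff above)
```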
1 change: 1 addition & 0 deletions scripts/train/finetune_example/mpt-7b-arc-easy--gpu.yaml
@@ -10,6 +10,7 @@ model:
  pretrained_model_name_or_path: mosaicml/mpt-7b
  pretrained: true  # false: only use the architecture; true: initialize with pretrained weights
  config_overrides:
+    max_seq_len: ${max_seq_len}
    attn_config:
      attn_impl: triton
      # Set this to `true` if using `train_loader.dataset.packing_ratio` below
1 change: 1 addition & 0 deletions scripts/train/yamls/finetune/mpt-7b_dolly_sft.yaml
@@ -9,6 +9,7 @@ model:
  pretrained: true
  pretrained_model_name_or_path: mosaicml/mpt-7b
  config_overrides:
+    max_seq_len: ${max_seq_len}
    attn_config:
      attn_impl: triton
      # Set this to `true` if using `train_loader.dataset.packing_ratio` below
