
Edit tutorial comments on PEFT / LoRA #416

Merged: vchiley merged 2 commits into mosaicml:main from lora_cmts on Jul 3, 2023
Conversation

@vchiley (Contributor) commented on Jul 3, 2023

This doesn't really address the original issue, but it lets us note that FSDP + LoRA not working is a known issue.

@vchiley vchiley self-assigned this Jul 3, 2023
@danbider (Contributor) commented on Jul 3, 2023

I had a hard time pushing my commit through because of the spaces, so here's my suggestion:

Can I finetune using PEFT / LoRA?

  • The LLM Foundry codebase does not directly include examples of PEFT or LoRA workflows. However, our MPT model is a subclass of Hugging Face's PreTrainedModel, and Feature/peft compatible models #346 added the features required to enable Hugging Face's PEFT / LoRA workflows for MPT. MPT models with LoRA modules can be trained either with LLM Foundry or with Hugging Face's accelerate (see the sketch after this list). Within LLM Foundry, run scripts/train/train.py and add lora arguments to the config .yaml, like so:
lora:
  args:
    r: 16
    lora_alpha: 32
    lora_dropout: 0.05
    target_modules: ['Wqkv']
  • In the current release, these features have Beta support.
  • For efficiency, the MPT model concatenates the Q, K, and V matrices in each attention block into a single Wqkv matrix that is three times wider. Currently, LoRA supports a low-rank approximation of this fused Wqkv matrix.
  • Known issue: PEFT / LoRA do not directly work with FSDP.
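
For the Hugging Face route, here is a minimal sketch of attaching a LoRA adapter to MPT via the peft library. The checkpoint name and hyperparameters simply mirror the lora.args block above and are illustrative, not an official recipe:

# Minimal sketch (assumes `transformers` and `peft` are installed) of wrapping
# MPT with a LoRA adapter through Hugging Face PEFT, mirroring the lora.args
# values from the YAML above.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# MPT subclasses PreTrainedModel, so it loads like any other HF causal LM.
model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",        # illustrative checkpoint name
    trust_remote_code=True,   # MPT ships custom modeling code
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["Wqkv"],  # MPT fuses Q, K, V into one Wqkv projection
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable

From here the wrapped model can be trained with a standard Hugging Face Trainer or accelerate loop; note the known FSDP issue above if you try to shard it.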

@codestar12 (Contributor) left a comment:
LGTM


@vchiley vchiley merged commit 4e6a878 into mosaicml:main Jul 3, 2023
9 checks passed
@SeanTech99

May I know how to use the DDP config here in the YAML?

@germanjke

@vchiley

@vchiley vchiley deleted the lora_cmts branch November 9, 2023 22:19