Specify blocks to Train w/ Finetuning? #1636

Open
setothegreat opened this issue Sep 24, 2024 · 1 comment

Comments

@setothegreat

Since it appears that Flux LoRA training can still be effective when only specific layers are trained, I'm wondering if this functionality can be expanded to finetuning, since that is where the biggest roadblocks around speed and hardware currently lie. Rather than being limited to Adafactor and dozens of hours per training run, being able to specify a subset of layers to train should lower hardware requirements, allow the use of potentially more efficient optimizers on consumer-grade hardware, and could bring training time down by an order of magnitude.

Is there some architecture-level roadblock I'm not aware of that prevents training specific layers during a full finetune, but doesn't exist when training a LoRA?
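
For concreteness, training only a subset of blocks during a full finetune would amount to something like the following PyTorch sketch, which freezes every parameter outside the selected blocks via `requires_grad`. The block prefixes used here (e.g. `double_blocks.7.`) are only illustrative and are not taken from this repository's actual module names.

```python
import torch
import torch.nn as nn

def freeze_except(model: nn.Module, trainable_prefixes: list[str]) -> list[nn.Parameter]:
    """Freeze every parameter whose name does not start with one of the
    given prefixes; return the parameters that stay trainable."""
    trainable = []
    for name, param in model.named_parameters():
        keep = any(name.startswith(prefix) for prefix in trainable_prefixes)
        param.requires_grad = keep
        if keep:
            trainable.append(param)
    return trainable

# Hypothetical usage: full finetuning of just two blocks with a heavier optimizer.
# trainable_params = freeze_except(flux_model, ["double_blocks.7.", "double_blocks.18."])
# optimizer = torch.optim.AdamW(trainable_params, lr=1e-5)
```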

@kohya-ss
Owner

The model parameters need to be kept in VRAM in bf16, which consumes about 22GB of VRAM (block swap is implemented to reduce that). Therefore, training only some layers will not help much in reducing VRAM.
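
Rough arithmetic behind that figure (a sketch, assuming Flux has on the order of 12B parameters; the exact count and the trainable fraction below are illustrative): freezing blocks only removes their gradient and optimizer-state memory, while the bf16 weights themselves still have to be resident for the forward pass.

```python
# Back-of-the-envelope VRAM estimate; the 12B parameter count is an assumption for illustration.
params = 12e9
bf16_bytes = 2

weights = params * bf16_bytes                      # all weights stay resident for the forward pass
print(f"bf16 weights alone: {weights / 2**30:.1f} GiB")      # ~22 GiB, regardless of which layers train

trainable_fraction = 0.25                          # e.g. only a quarter of the blocks receive gradients
grads = params * trainable_fraction * bf16_bytes   # bf16 gradients only for the trainable subset
print(f"gradients for trainable subset: {grads / 2**30:.1f} GiB")
```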
