Since it appears that Flux LoRA training can still be effective when only specific layers are trained, I am wondering whether this functionality could be extended to finetuning, since that is where the biggest roadblocks around speed and hardware currently lie. Rather than being limited to Adafactor and dozens of hours per training run, being able to specify a subset of layers to train should lower hardware requirements, allowing potentially more efficient optimizers on consumer-grade hardware, and could bring training time down by an order of magnitude.
Is there some architecture-level roadblock I'm not aware of that prevents training specific layers during a full finetune but doesn't exist when training a LoRA?
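For context, the way layer-restricted LoRA training works is essentially a name filter over the model's parameters, and in principle the same filter could freeze everything else during a full finetune. A minimal sketch of that selection logic (the parameter names, sizes, and prefixes below are purely illustrative, not real Flux layer names or sd-scripts options):

```python
# Sketch: pick a subset of layers to train by name prefix, mirroring
# how LoRA target-module filters select parameters. In a real trainer,
# the frozen set would get requires_grad=False and be excluded from
# the optimizer. All names/sizes here are made up for illustration.
params = {
    "double_blocks.0.attn.qkv.weight": 9_437_184,
    "double_blocks.18.attn.qkv.weight": 9_437_184,
    "single_blocks.0.linear1.weight": 21_233_664,
    "single_blocks.37.linear1.weight": 21_233_664,
    "final_layer.linear.weight": 196_608,
}

# Hypothetical user choice: train only the last single block + output layer.
train_prefixes = ("single_blocks.37.", "final_layer.")

trainable = {n: s for n, s in params.items() if n.startswith(train_prefixes)}
frozen = {n: s for n, s in params.items() if n not in trainable}

print(f"{len(trainable)} trainable tensors, {len(frozen)} frozen")
print(f"trainable params: {sum(trainable.values()):,}")
```

The filtering itself is trivial either way; the question is how much memory it actually saves, which the reply below addresses.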
The model parameters need to be stored in VRAM in bf16, which consumes 22GB of VRAM (block swap is implemented to reduce that). Therefore, training only some layers will not help much in reducing VRAM.
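The arithmetic behind this can be made concrete: the full bf16 weights must be resident for the forward pass regardless of which layers are trainable, so freezing layers only shrinks the gradient and optimizer-state overhead on top of that fixed cost. A rough back-of-envelope sketch (assuming ~12B parameters and an AdamW-style optimizer with two fp32 moments; activation memory is ignored):

```python
# Rough VRAM estimate: bf16 weights are always resident; gradients and
# optimizer state scale with the trainable fraction. Assumes ~12B params
# and AdamW-style fp32 moments purely for illustration.
GB = 1024 ** 3
n_params = 12e9

weights_bf16 = n_params * 2 / GB  # 2 bytes/param, always in VRAM

def training_overhead_gb(trainable_fraction):
    grads = n_params * trainable_fraction * 2 / GB        # bf16 gradients
    moments = n_params * trainable_fraction * 8 / GB      # two fp32 moments
    return grads + moments

print(f"weights (fixed):        {weights_bf16:.1f} GB")
print(f"overhead, all layers:   {training_overhead_gb(1.0):.1f} GB")
print(f"overhead, 25% of layers:{training_overhead_gb(0.25):.1f} GB")
```

The fixed ~22GB of weights dominates on consumer cards, which is why layer selection alone doesn't bring full finetuning into reach the way it might seem to, and why block swap targets the weights themselves.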