[Feature Request] Decouple `linear1` and `linear2` Flux layers in `network_args` #1613
This is interesting. Since linear1 and 2 belong to the same attention, I don't think there is any need to separate them. I think we can get almost the same effect by training linear1 and 2 with half the dim (rank). Have you tried it?
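For instance (an illustration, assuming the existing `single_dim` network arg, which covers both `linear1` and `linear2` of the single blocks), halving the rank would be:

```
--network_module networks.lora_flux --network_args "single_dim=64"
```

i.e. rank 64 on both projections rather than rank 128 on `linear2` alone.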
No, but I'll give it a go! I modified my copy of `lora_flux.py`.
Any chance you could post the patch?
My sd-scripts is heavily customized, but here's how you can apply it to yours:
The change goes in `sd-scripts/networks/lora_flux.py`, around line 665 at commit `95ff9db`.
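Roughly, the idea is to branch the per-module rank selection on the module name. A minimal sketch (the helper name and the `single_linear1_dim` / `single_linear2_dim` parameters are illustrative, not the actual patch or sd-scripts API):

```python
# Rough sketch only -- illustrative, not the actual sd-scripts code.
# FLUX single-block LoRA module names include "single_blocks" together with
# "linear1" or "linear2"; a rank of 0 is treated as "skip this module".

def resolve_single_block_dim(lora_name, single_dim,
                             single_linear1_dim=None, single_linear2_dim=None):
    """Pick the LoRA rank for one module; 0 means the module is not trained."""
    if "single_blocks" in lora_name:
        if "linear1" in lora_name and single_linear1_dim is not None:
            return single_linear1_dim  # e.g. 0 to leave linear1 untouched
        if "linear2" in lora_name and single_linear2_dim is not None:
            return single_linear2_dim
    return single_dim  # unchanged behavior when the new args are absent
```

A real patch would presumably also need to parse the new keys out of `network_args`, but the rank lookup is the heart of it.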
Hope that helps. I'm still in the process of running some tests on training …
Hi,

There's a popular discussion thread that suggests training the `proj_out` (`linear2`) module of single blocks 7 and 20 for Flux LoRAs: https://old.reddit.com/r/StableDiffusion/comments/1f523bd/good_flux_loras_can_be_less_than_45mb_128_dim/

As far as I can tell, it is not yet possible to isolate `linear2` through the sd-scripts `network_args` flag. Perhaps this is as close as it gets:
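One near-equivalent with today's arguments (an illustration; `train_single_block_indices` is assumed from the current advanced options) would be:

```
--network_module networks.lora_flux \
--network_args "train_single_block_indices=7,20" "single_dim=128"
```

This limits training to single blocks 7 and 20, but still trains `linear1` alongside `linear2`.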
I propose replacing the `single_dim` argument with e.g. `single_linear1_dim` and `single_linear2_dim`. That way, we can specify `single_linear1_dim=0` to reproduce the training method outlined in the thread above (see the sketch at the end of this post).

Or is this already possible with a different set of arguments?

Thanks!
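A sketch of what that could look like (the `single_linear1_dim` / `single_linear2_dim` flags are the proposal, not existing options, and `train_single_block_indices` is assumed from the current advanced args):

```
--network_module networks.lora_flux \
--network_args "single_linear1_dim=0" "single_linear2_dim=128" "train_single_block_indices=7,20"
```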