
Recommended settings for ScheduleFree. #1631

Closed
waomodder opened this issue Sep 22, 2024 · 4 comments
@waomodder

#1600
I'm using the ScheduleFree optimizer, but the initial loss value is over 3000 and convergence is too slow. Even after an hour, I can't get anywhere near the loss I reach with AdamW (around 0.3).
Could you tell me if there are any recommended settings?

Command:
accelerate launch --num_cpu_threads_per_process 20 flux_train_network.py --pretrained_model_name_or_path "D:\ComfyUI_windows_portable\ComfyUI\models\unet\flux1devpro2.safetensors" --train_data_dir "D:\Lora_learning\Data\asset\super_robot_diffusion_F" --output_dir "D:\Lora_learning\Data\output" --network_module "networks.lora_flux" --gradient_checkpointing --persistent_data_loader_workers --cache_latents --cache_latents_to_disk --max_data_loader_n_workers 12 --enable_bucket --save_model_as "safetensors" --lr_scheduler_num_cycles 4 --mixed_precision "bf16" --resolution 1024 --train_batch_size 1 --max_train_epochs 10 --network_dim 32 --network_alpha 256.0 --save_every_n_epochs 1 --save_every_n_steps 250 --optimizer_type "adamwschedulefree" --output_name "SRD_F_v05_t11" --ae "D:\ComfyUI_windows_portable\ComfyUI\models\vae\ae.safetensors" --bucket_no_upscale --save_precision "fp16" --min_bucket_reso 320 --max_bucket_reso 2048 --caption_extension ".txt" --seed 42 --fp8_base --highvram --loss_type "l2" --huber_schedule "snr" --gradient_accumulation_steps 2 --timestep_sampling flux_shift --model_prediction_type "raw" --guidance_scale 1 --clip_l "D:\stable-diffusion-webui\models\CLIP\clip_l.safetensors" --t5xxl "D:\stable-diffusion-webui\models\CLIP\t5xxl_fp16.safetensors" --sdpa --cache_text_encoder_outputs --cache_text_encoder_outputs_to_disk --network_weights "D:\Lora_learning\Data\output\SRD_F_v05_t10-000008.safetensors"

[attached image]


recris commented Sep 24, 2024

You need to provide the base learning_rate explicitly, same as before.

Also, your network_alpha seems far too high; you should use the same value as network_dim, or lower.

Plus, huber_schedule is ignored when loss_type is not "huber" (and Huber loss is currently not supported for Flux).
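Applied to the original command, those three points would look roughly like this. This is only a sketch: the 2e-4 learning rate is a placeholder, not a value recommended in this thread so far.

```shell
# Illustrative fragment only -- the flags that change, per the advice above:
#  * --learning_rate must be supplied explicitly (2e-4 is a placeholder)
#  * --network_alpha lowered to be <= --network_dim (was 256.0 with dim 32)
#  * --huber_schedule dropped, since --loss_type is "l2"
accelerate launch --num_cpu_threads_per_process 20 flux_train_network.py \
  --optimizer_type "adamwschedulefree" \
  --learning_rate 2e-4 \
  --network_dim 32 \
  --network_alpha 32.0 \
  --loss_type "l2"
# ...plus the remaining original flags, unchanged
```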

@waomodder (Author)

@recris
Thanks for pointing this out. Could you share the adamwschedulefree settings you think work best?


recris commented Sep 25, 2024

I typically start with network_dim = 16, network_alpha = 8 and learning_rate = 2e-4 then tweak the LR from there.

I also recommend training at a lower resolution (like 640px) at first while experimenting with different parameters; it's much quicker when you're figuring out what the optimal settings are.
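As a concrete starting point, those numbers map onto the command-line flags like this (a sketch, not a verified recipe; tune learning_rate from here):

```shell
# Hypothetical starting flags based on the suggestion above:
--optimizer_type "adamwschedulefree" \
--network_dim 16 \
--network_alpha 8.0 \
--learning_rate 2e-4 \
--resolution 640
```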

@waomodder (Author)

Thank you for your detailed guidance.
