
Minimal training setup #146

Closed
sachit-menon opened this issue Apr 1, 2023 · 4 comments

Comments

@sachit-menon

Hi! Could you share the smallest/minimal setup you trained to get signs of life before the full 7B run? (Maybe something with OPT 1.3B, as the README suggests?)

@anas-awadalla
Collaborator

anas-awadalla commented Apr 1, 2023

Yep, we did initial runs using OPT 1.3B and ViT-L-14.

We used the following hyperparameters (which are exactly the same as in the README apart from the batch sizes):

  • Batch size (LAION 2B): 512
  • Batch size (MMC4): 256
  • loss_multiplier_laion: 0.2
  • lr_scheduler: constant
  • warmup_steps: 5000
  • mmc4_textsim_threshold: 30
  • use_media_placement_augmentation

We see 'signs of life' (relevant predictions for images, etc.) after ~10k steps, which at a LAION 2B batch size of 512 is around 5M samples seen. You should also be able to get signs of life by training only on LAION 2B (there is a PR out to add that option, although I haven't tested it yet). I am not sure exactly how many steps you would need if you are only training on LAION. Let me know if there are any other details I can provide!
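For concreteness, below is a small sketch (not from the original thread) that collects the hyperparameters listed above into a config and prints a possible launch command. The flag spellings, the `lm_path`/`vision_encoder_path` values, the `torchrun` invocation, and the GPU count are assumptions for illustration; the real argument names, plus the required dataset-shard and checkpointing paths omitted here, should be taken from the repo's training script (e.g. `open_flamingo/train/train.py`) and the training README.

```python
# Hedged sketch: the small-scale (OPT 1.3B + ViT-L-14) run described above,
# expressed as a config dict and turned into CLI flags.
# NOTE: flag names mirror the hyperparameter names in the comment; verify the
# exact spellings and the other required arguments (dataset shards, checkpoint
# dir, etc.) against the repo's training script before running anything.
small_run = {
    "lm_path": "facebook/opt-1.3b",       # assumed HF identifier for the LM backbone
    "vision_encoder_path": "ViT-L-14",    # assumed name for the CLIP vision encoder
    "batch_size_laion": 512,              # "Batch size (LAION 2B)" above
    "batch_size_mmc4": 256,               # "Batch size (MMC4)" above
    "loss_multiplier_laion": 0.2,
    "lr_scheduler": "constant",
    "warmup_steps": 5000,
    "mmc4_textsim_threshold": 30,
    "use_media_placement_augmentation": True,
}

# Convert the dict into CLI flags (boolean True becomes a bare flag).
flags = []
for key, value in small_run.items():
    if isinstance(value, bool):
        if value:
            flags.append(f"--{key}")
    else:
        flags.append(f"--{key} {value}")

# Print a hypothetical launch command; adjust script path and GPU count to your setup.
print(
    "torchrun --nproc_per_node=8 open_flamingo/train/train.py \\\n  "
    + " \\\n  ".join(flags)
)
```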

@ccliu2

ccliu2 commented Apr 4, 2023

Thank you very much for the amazing work! I am wondering if it would be possible to make the trained OPT-1.3B model available?

@anas-awadalla
Collaborator

anas-awadalla commented Apr 4, 2023

@ccliu2 Thanks for your interest! Yes, we can release a 1.3B model, but it will probably be in our next release, as we want to make sure its performance is on par with DeepMind's version.

@i-gao
Collaborator

i-gao commented Jun 30, 2023

We've released a 3B model built on MPT-1B; we ended up finding that MPT-1B gave stronger performance than an OPT-1.3B backbone. Closing this for now. Thanks for your interest!

@i-gao i-gao closed this as completed Jun 30, 2023