Large-scale hyperparameter searches using Lightning + Hydra + MLflow + SubmitIt. Only one part of the code needs to change -- the hyperparameters themselves -- and the search runs automatically: any necessary jobs are submitted, and all updates are posted to a central MLflow server. The end result is the MLflow UI, in which we can track every update of the search live. If an automatic sweeper is enabled (Ax is included as an example in this repo), we can also follow in real time which parameters are being tested.
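To make concrete what a grid search expands to, here is a small sketch: every combination of the listed hyperparameter values becomes one job. The parameter names and values below are illustrative, not taken from this repo's configs.

```python
# Sketch of grid-search expansion: the Cartesian product of all
# hyperparameter value lists, one dict (= one job) per combination.
from itertools import product

search_space = {
    "lr": [1e-2, 1e-3],
    "batch_size": [32, 64],
}

# dict preserves insertion order, so zip pairs each key with its value
jobs = [dict(zip(search_space, values)) for values in product(*search_space.values())]
print(len(jobs))  # -> 4
```

With a Hydra sweeper/launcher, this expansion and the per-combination job submission happen automatically from the config.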
Clone the repo and install the dependencies. Create a `.env` file containing a `DATA_DIR` variable specifying where to find/download the MNIST dataset, and a `MAIN_CONFIG` variable set to either `grid` or `bayesian`. The `grid` config can be used for grid searches or 'normal' training without any hyperparameter search at all. The `bayesian` config includes by default an 'Ax' sweeper that tries to optimize the configured metric (`val/acc_best` by default).
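For example, the `.env` file could look like the following; the path is a placeholder to adapt to your setup:

```
# .env -- path is a placeholder
DATA_DIR=/path/to/datasets
# either "grid" or "bayesian"
MAIN_CONFIG=grid
```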
Note: due to a bug in Hydra's plugin discovery, you should use/adapt Hydra with the pull request here (a one-line fix): facebookresearch/hydra#2019
The most convenient approach is to write a bash script and submit it to Slurm via `sbatch`; see also the explanation in the `/scripts` folder.
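A minimal submission script might look like the sketch below. The entry-point name and all `#SBATCH` resource flags are assumptions to be adapted to your cluster and to the actual scripts in `/scripts`:

```
#!/bin/bash
#SBATCH --job-name=hparam-search
#SBATCH --time=24:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G

# Hypothetical entry point; the real script name may differ in this repo.
# Hydra's -m/--multirun flag launches one run per parameter combination.
python train.py -m
```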
The Lightning and Hydra parts of this template are largely based on https://github.com/ashleve/lightning-hydra-template. If you prefer YAML configs over the structured configs used in this project, you can find them there as well.