Release/1.1 (#174)
* introduce different running modes: default, debug, experiment
* fix pytorch installation in setup_conda.sh
* fix incorrect calculation of precision, recall and f1 score in wandb callback
* add `_self_` to config.yaml for compatibility with Hydra 1.1
* fix setting seed in `train.py` so it's skipped when `seed=null` (see the sketch after this list)
* add exception message when trying to use wandb callbacks with `trainer.fast_dev_run=true`
* change `axis=-1` to `dim=-1` in LogImagePredictions callback
* add 'Reproducibility' section to README.md
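As a rough illustration of the seed fix and the `dim=-1` fix listed above, here is a minimal sketch (not the template's actual code; the `config` fragment and the fake logits are assumptions):

```python
import pytorch_lightning as pl
import torch
from omegaconf import OmegaConf

# Hypothetical config fragment; in the template this comes from Hydra.
config = OmegaConf.create({"seed": None})

# Seed guard: seeding is skipped when `seed=null`,
# since Hydra's `null` is parsed as Python's None.
if config.seed is not None:
    pl.seed_everything(config.seed)

# LogImagePredictions fix: torch reduces over `dim`, not `axis`.
logits = torch.randn(4, 10)           # fake batch of 10-class logits
preds = torch.argmax(logits, dim=-1)  # predicted class per sample
```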
Łukasz Zalewski authored Sep 28, 2021
1 parent 89b502b commit 86f30fb
Showing 19 changed files with 131 additions and 81 deletions.
61 changes: 33 additions & 28 deletions README.md
@@ -74,7 +74,7 @@ The directory structure of new project looks like this:
│ ├── datamodule <- Datamodule configs
│ ├── experiment <- Experiment configs
│ ├── hparams_search <- Hyperparameter search configs
│ ├── hydra <- Hydra related configs
│ ├── mode <- Running mode configs
│ ├── logger <- Logger configs
│ ├── model <- Model configs
│ ├── trainer <- Trainer configs
@@ -150,7 +150,6 @@ python run.py trainer.max_epochs=20 model.lr=1e-4
> You can also add new parameters with `+` sign.
```yaml
python run.py +model.new_param="uwu"

```

</details>
@@ -217,6 +216,21 @@ python run.py logger=wandb
</details>


<details>
<summary><b>Use different logging modes</b></summary>

```yaml
# debug mode changes logging folder to `logs/debug/`
python run.py mode=debug

# experiment mode changes logging folder to `logs/experiments/name_of_your_experiment/`
# also sets custom experiment name in the logger
python run.py mode=exp name='my_new_experiment_253'
```

</details>


<details>
<summary><b>Train model with chosen experiment config</b></summary>

@@ -269,7 +283,7 @@ python run.py +trainer.max_time="00:12:00:00"

```yaml
# run 1 train, val and test loop, using only 1 batch
python run.py debug=true
python run.py trainer.fast_dev_run=true

# print full weight summary of all PyTorch modules
python run.py trainer.weights_summary="full"
@@ -348,12 +362,12 @@ python run.py -m 'experiment=glob(*)'
</details>

<details>
<!-- <details>
<summary><b>Execute sweep on a SLURM cluster</b></summary>
> This should be achievable with either [the right lightning trainer flags](https://pytorch-lightning.readthedocs.io/en/latest/clouds/cluster.html?highlight=SLURM#slurm-managed-cluster) or a simple config using [Submitit launcher for Hydra](https://hydra.cc/docs/plugins/submitit_launcher). An example is not yet implemented in this template.
</details>
</details> -->


<details>
@@ -433,7 +447,7 @@ defaults:
- callbacks: default.yaml # set this to null if you don't want to use callbacks
- logger: null # set logger here or use command line (e.g. `python run.py logger=wandb`)

- hydra: default.yaml
- mode: default.yaml

- experiment: null
- hparams_search: null
@@ -598,13 +612,13 @@ By default, logs have the following structure:
```

You can change this structure by modifying paths in [hydra configuration](configs/hydra/default.yaml).
You can change this structure by modifying paths in [hydra configuration](configs/mode).
<br><br>


### Experiment Tracking
PyTorch Lightning supports the most popular logging frameworks:<br>
**[Weights&Biases](https://www.wandb.com/) · [Neptune](https://neptune.ai/) · [Comet](https://www.comet.ml/) · [MLFlow](https://mlflow.org) · [Aim](https://github.com/aimhubio/aim) · [Tensorboard](https://www.tensorflow.org/tensorboard/)**
**[Weights&Biases](https://www.wandb.com/) · [Neptune](https://neptune.ai/) · [Comet](https://www.comet.ml/) · [MLFlow](https://mlflow.org) · [Tensorboard](https://www.tensorflow.org/tensorboard/)**

These tools help you keep track of hyperparameters and output metrics and allow you to compare and visualize results. To use one of them simply complete its configuration in [configs/logger](configs/logger) and run:
```yaml
@@ -684,7 +698,7 @@ hydra:
</details>

Next, you can execute it with: `python run.py -m hparams_search=mnist_optuna`<br>
Using this approach doesn't require you to add any boilerplate into your pipeline, everything is defined in a single config file. You can use different optimization frameworks integrated with Hydra, like Optuna, Ax or Nevergrad.
Using this approach doesn't require you to add any boilerplate to your pipeline; everything is defined in a single config file. You can use different optimization frameworks integrated with Hydra, like Optuna, Ax or Nevergrad. The `optimization_results.yaml` file will be available under the `logs/multirun` folder.
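For instance, a minimal sketch of inspecting the sweep output afterwards, assuming `optimization_results.yaml` is plain YAML (the multirun path below is hypothetical, since Hydra names these folders by timestamp):

```python
from pathlib import Path
import yaml

# Hypothetical path; Hydra names multirun folders by timestamp.
results_file = Path("logs/multirun/2021-09-28_12-00-00/optimization_results.yaml")

with results_file.open() as f:
    results = yaml.safe_load(f)

# The exact keys depend on the sweeper version; printing shows the best
# score and the parameter values that achieved it.
print(results)
```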
<br><br>


@@ -801,23 +815,14 @@ python run.py trainer.gpus=4 +trainer.accelerator="ddp"
<br><br>


### Extra Features
List of extra utilities available in the template:
- loading environment variables from [.env](.env.example) file
- pretty printing config with [Rich](https://github.com/willmcgugan/rich) library
- disabling python warnings
- debug mode
<!-- - (TODO) resuming latest run -->

You can easily remove any of those by modifying [run.py](run.py) and [src/train.py](src/train.py).
### Reproducibility
To reproduce a previous experiment, simply load its config from the logs:
```yaml
python run.py --config-path /logs/runs/.../.hydra/ --config-name config.yaml
```
The `config.yaml` inside the `.hydra` folder contains all overridden parameters and sections.
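A saved config can also be inspected programmatically before re-running it. A minimal sketch using OmegaConf, with a hypothetical run path:

```python
from omegaconf import OmegaConf

# Hypothetical path; substitute the run you want to reproduce.
cfg = OmegaConf.load("logs/runs/2021-09-28/12-00-00/.hydra/config.yaml")
print(OmegaConf.to_yaml(cfg))  # the config of that run, as saved by Hydra
```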
<br><br>

<!--
### Limitations
(TODO)
<br><br><br>
-->


## Best Practices
<!--<details>
@@ -1076,10 +1081,10 @@ eval "$(python run.py -sc install=bash)"
# enable aliases for debugging
alias test='pytest'
alias debug1='python run.py debug=true'
alias debug2='python run.py trainer.gpus=1 trainer.max_epochs=1'
alias debug3='python run.py trainer.gpus=1 trainer.max_epochs=1 +trainer.limit_train_batches=0.1'
alias debug_wandb='python run.py trainer.gpus=1 trainer.max_epochs=1 logger=wandb logger.wandb.project=tests'
alias debug1='python run.py mode=debug'
alias debug2='python run.py mode=debug trainer.fast_dev_run=false trainer.gpus=1 trainer.max_epochs=1'
alias debug3='python run.py mode=debug trainer.fast_dev_run=false trainer.gpus=1 trainer.max_epochs=1 +trainer.limit_train_batches=0.1'
alias debug_wandb='python run.py mode=debug trainer.fast_dev_run=false trainer.gpus=1 trainer.max_epochs=1 logger=wandb logger.wandb.project=tests'
```
(these commands will be executed whenever you open or switch a terminal to a folder containing the `.autoenv` file)

37 changes: 22 additions & 15 deletions bash/setup_conda.sh
@@ -1,33 +1,40 @@
#!/bin/bash
# Run from root folder with: bash bash/setup_conda.sh

# check if conda is installed

# Check if conda is installed
if ! command -v conda &> /dev/null
then
echo "The 'conda' command could not be found. Exiting..."
exit
fi

# This line is needed for enabling conda env activation
source ~/miniconda3/etc/profile.d/conda.sh

# Configure conda env
read -rp "Enter environment name: " env_name
# Configure env
read -rp "Enter conda environment name: " env_name
read -rp "Enter python version (recommended '3.8') " python_version
read -rp "Enter cuda version (recommended '10.2', or 'none' for CPU only): " cuda_version
read -rp "Enter pytorch version (recommended '1.8.1'): " pytorch_version
read -rp "Enter cuda version ('10.2', '11.1' or 'none' for CPU only): " cuda_version


# Create conda env
# Create env
conda create -y -n "$env_name" python="$python_version"
conda activate "$env_name"

# Install pytorch

# Install pytorch + cuda
if [ "$cuda_version" == "none" ]; then
conda install -y pytorch=$pytorch_version torchvision torchaudio cpuonly -c pytorch
conda install -n "$env_name" -y pytorch torchvision torchaudio cpuonly -c pytorch
elif [ "$cuda_version" == "10.2" ]; then
conda install -n "$env_name" pytorch torchvision torchaudio cudatoolkit=$cuda_version -c pytorch
elif [ "$cuda_version" == "11.1" ]; then
conda install -n "$env_name" pytorch torchvision torchaudio cudatoolkit=$cuda_version -c pytorch -c nvidia
else
conda install -y pytorch=$pytorch_version torchvision torchaudio cudatoolkit=$cuda_version -c pytorch
echo "Incorrect cuda version. Exiting..."
exit
fi

echo "\n"
echo "To activate this environment, use:"

# Final message
echo "======================================="
echo "To activate this environment use:"
echo "conda activate $env_name"
echo "======================================="
echo -e "\a"
19 changes: 10 additions & 9 deletions configs/config.yaml
@@ -2,17 +2,18 @@

# specify here default training configuration
defaults:
- _self_
- trainer: default.yaml
- model: mnist_model.yaml
- datamodule: mnist_datamodule.yaml
- callbacks: default.yaml # set this to null if you don't want to use callbacks
- callbacks: default.yaml
- logger: null # set logger here or use command line (e.g. `python run.py logger=wandb`)

- mode: default.yaml

- experiment: null
- hparams_search: null

- hydra: default.yaml

# enable color logging
- override hydra/hydra_logging: colorlog
- override hydra/job_logging: colorlog
@@ -26,12 +27,6 @@ work_dir: ${hydra:runtime.cwd}
# path to folder with data
data_dir: ${work_dir}/data/

# use `python run.py debug=true` for easy debugging!
# this will run 1 train, val and test loop with only 1 batch
# equivalent to running `python run.py trainer.fast_dev_run=true`
# (this is placed here just for easier access from command line)
debug: False

# pretty print config at the start of the run using Rich library
print_config: True

@@ -41,3 +36,9 @@ ignore_warnings: True
# check performance on test set, using the best model achieved during training
# lightning chooses best model based on metric specified in checkpoint callback
test_after_training: True

# seed for random number generators in pytorch, numpy and python.random
seed: null

# name of the run, accessed by loggers
name: null
2 changes: 1 addition & 1 deletion configs/experiment/example_simple.yaml
@@ -4,7 +4,7 @@
# python run.py experiment=example_simple.yaml

defaults:
- override /trainer: default.yaml # choose trainer from 'configs/trainer/'
- override /trainer: default.yaml
- override /model: mnist_model.yaml
- override /datamodule: mnist_datamodule.yaml
- override /callbacks: default.yaml
1 change: 0 additions & 1 deletion configs/hparams_search/mnist_optuna.yaml
@@ -3,7 +3,6 @@
# example hyperparameter optimization of some experiment with Optuna:
# python run.py -m hparams_search=mnist_optuna experiment=example_simple
# python run.py -m hparams_search=mnist_optuna experiment=example_simple hydra.sweeper.n_trials=30
# python run.py -m hparams_search=mnist_optuna experiment=example_simple logger=wandb

defaults:
- override /hydra/sweeper: optuna
12 changes: 0 additions & 12 deletions configs/hydra/default.yaml

This file was deleted.

4 changes: 2 additions & 2 deletions configs/logger/comet.yaml
@@ -2,6 +2,6 @@

comet:
_target_: pytorch_lightning.loggers.comet.CometLogger
api_key: ${oc.env:COMET_API_TOKEN} # api key is laoded from environment variable
api_key: ${oc.env:COMET_API_TOKEN} # api key is loaded from environment variable
project_name: "template-tests"
experiment_name: null
experiment_name: ${name}
2 changes: 1 addition & 1 deletion configs/logger/csv.yaml
@@ -4,5 +4,5 @@ csv:
_target_: pytorch_lightning.loggers.csv_logs.CSVLogger
save_dir: "."
name: "csv/"
version: null
version: ${name}
prefix: ""
1 change: 0 additions & 1 deletion configs/logger/many_loggers.yaml
@@ -1,7 +1,6 @@
# train with many loggers at once

defaults:
# - aim.yaml
# - comet.yaml
- csv.yaml
# - mlflow.yaml
2 changes: 1 addition & 1 deletion configs/logger/mlflow.yaml
@@ -2,7 +2,7 @@

mlflow:
_target_: pytorch_lightning.loggers.mlflow.MLFlowLogger
experiment_name: default
experiment_name: ${name}
tracking_uri: null
tags: null
save_dir: ./mlruns
2 changes: 1 addition & 1 deletion configs/logger/neptune.yaml
@@ -6,6 +6,6 @@ neptune:
project_name: your_name/template-tests
close_after_fit: True
offline_mode: False
experiment_name: null
experiment_name: ${name}
experiment_id: null
prefix: ""
2 changes: 1 addition & 1 deletion configs/logger/tensorboard.yaml
@@ -4,7 +4,7 @@ tensorboard:
_target_: pytorch_lightning.loggers.tensorboard.TensorBoardLogger
save_dir: "tensorboard/"
name: "default"
version: null
version: ${name}
log_graph: False
default_hp_metric: True
prefix: ""
2 changes: 1 addition & 1 deletion configs/logger/wandb.yaml
@@ -3,7 +3,7 @@
wandb:
_target_: pytorch_lightning.loggers.wandb.WandbLogger
project: "template-tests"
name: null
name: ${name}
save_dir: "."
offline: False # set True to store all logs only locally
id: null # pass correct id to resume experiment!
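The `${name}` value appearing in these logger configs is an OmegaConf interpolation that resolves against the new top-level `name` field in `config.yaml`, so every logger picks up the same run name. A minimal sketch of the mechanism, using a made-up config fragment:

```python
from omegaconf import OmegaConf

# Made-up fragment mirroring the structure above.
cfg = OmegaConf.create({
    "name": "my_new_experiment_253",
    "logger": {"wandb": {"name": "${name}"}},
})

# `${name}` resolves from the config root:
print(cfg.logger.wandb.name)  # -> my_new_experiment_253
```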
15 changes: 15 additions & 0 deletions configs/mode/debug.yaml
@@ -0,0 +1,15 @@
# @package _global_

# run in debug mode with:
# `python run.py mode=debug`

# this flag doesn't really do anything
debug: true

# output paths for debug mode
hydra:
run:
dir: logs/debug/${now:%Y-%m-%d}/${now:%H-%M-%S}
sweep:
dir: logs/debug/multirun_${now:%Y-%m-%d_%H-%M-%S}
subdir: ${hydra.job.num}
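The `${now:...}` patterns above use Hydra's timestamp resolver, which formats the current time with strftime-style patterns. A quick sketch of the equivalent expansion in plain Python:

```python
from datetime import datetime

# Equivalent of Hydra's ${now:%Y-%m-%d} and ${now:%H-%M-%S} resolvers.
now = datetime.now()
run_dir = f"logs/debug/{now:%Y-%m-%d}/{now:%H-%M-%S}"
print(run_dir)  # e.g. logs/debug/2021-09-28/14-05-33
```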
11 changes: 11 additions & 0 deletions configs/mode/default.yaml
@@ -0,0 +1,11 @@
# @package _global_

# default running mode

# default output paths for hydra logs
hydra:
run:
dir: logs/runs/${now:%Y-%m-%d}/${now:%H-%M-%S}
sweep:
dir: logs/multiruns/${now:%Y-%m-%d_%H-%M-%S}
subdir: ${hydra.job.num}
15 changes: 15 additions & 0 deletions configs/mode/exp.yaml
@@ -0,0 +1,15 @@
# @package _global_

# run in experiment mode with:
# python run.py mode=exp name='my_new_experiment_23'

# allows for custom naming of the experiment
name: ???

# output paths for experiment mode
hydra:
run:
dir: logs/experiments/${name}
sweep:
dir: logs/experiments/${name}
subdir: ${hydra.job.num}
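The `name: ???` above is OmegaConf's mandatory-value marker, so running `mode=exp` without a `name` fails at startup. A minimal sketch of the behavior, independent of Hydra:

```python
from omegaconf import OmegaConf
from omegaconf.errors import MissingMandatoryValue

cfg = OmegaConf.create({"name": "???"})  # "???" marks a required value

try:
    print(cfg.name)
except MissingMandatoryValue:
    print("`name` is required, e.g. `python run.py mode=exp name=my_run`")
```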