clarified in_memory, also cleaned up setup page a bit
Linardos committed Aug 31, 2024
1 parent b7c9930 commit c8c49d3
Showing 3 changed files with 56 additions and 39 deletions.
10 changes: 5 additions & 5 deletions GANDLF/metrics/classification.py
@@ -94,11 +94,11 @@ def __convert_tensor_to_int(input_tensor: torch.Tensor) -> torch.Tensor:
num_classes=params["model"]["num_classes"],
average=average_type_key,
),
f"auroc_{average_type}": tm.AUROC(
task=task,
num_classes=params["model"]["num_classes"],
average=average_type_key if average_type_key != "micro" else "macro",
),
# f"auroc_{average_type}": tm.AUROC(
# task=task,
# num_classes=params["model"]["num_classes"],
# average=average_type_key if average_type_key != "micro" else "macro",
# ),
}
for metric_name, calculator in calculators.items():
if "auroc" in metric_name:
20 changes: 10 additions & 10 deletions docs/customize.md
@@ -119,17 +119,17 @@ This file contains mid-level information regarding various parameters that can b

- These are various parameters that control the overall training process.
- `verbose`: generate verbose messages on console; generally used for debugging.
- `batch_size`: defines the batch size to be used for training.
- `in_memory`: this is to enable or disable lazy loading - setting to true reads all data once during data loading, resulting in improvements.
- `num_epochs`: defines the number of epochs to train for.
- `patience`: defines the number of epochs to wait for improvement before early stopping.
- `learning_rate`: defines the learning rate to be used for training.
- `scheduler`: defines the learning rate scheduler to be used for training, more details are [here](https://github.com/mlcommons/GaNDLF/blob/master/GANDLF/schedulers/__init__.py); can take the following sub-parameters:
- `batch_size`: batch size to be used for training.
- `in_memory`: enables or disables lazy loading. If set to `True`, all data is loaded into RAM once during data loading, which speeds up training. If set to `False`, data is read into RAM on demand, which slows training but reduces memory use; this is recommended when RAM is limited.
- `num_epochs`: number of epochs to train for.
- `patience`: number of epochs to wait for improvement before early stopping.
- `learning_rate`: learning rate to be used for training.
- `scheduler`: learning rate scheduler to be used for training, more details are [here](https://github.com/mlcommons/GaNDLF/blob/master/GANDLF/schedulers/__init__.py); can take the following sub-parameters:
- `type`: `triangle`, `triangle_modified`, `exp`, `step`, `reduce-on-plateau`, `cosineannealing`, `triangular`, `triangular2`, `exp_range`
- `min_lr`: defines the minimum learning rate to be used for training.
- `max_lr`: defines the maximum learning rate to be used for training.
- `optimizer`: defines the optimizer to be used for training, more details are [here](https://github.com/mlcommons/GaNDLF/blob/master/GANDLF/optimizers/__init__.py).
- `nested_training`: defines the number of folds to use nested training, takes `testing` and `validation` as sub-parameters, with integer values defining the number of folds to use.
- `min_lr`: minimum learning rate to be used for training.
- `max_lr`: maximum learning rate to be used for training.
- `optimizer`: optimizer to be used for training, more details are [here](https://github.com/mlcommons/GaNDLF/blob/master/GANDLF/optimizers/__init__.py).
- `nested_training`: number of folds to use nested training, takes `testing` and `validation` as sub-parameters, with integer values defining the number of folds to use.
- `memory_save_mode`: if enabled, resize/resample operations in `data_preprocessing` will save files to disk instead of directly getting read into memory as tensors
- **Queue configuration**: this defines how the queue for the input to the model is to be designed **after** the [patching strategy](#patching-strategy) has been applied, and more details are [here](https://torchio.readthedocs.io/data/patch_training.html?#queue). This takes the following sub-parameters:
- `q_max_length`: this determines the maximum number of patches that can be stored in the queue. Using a large number means that the queue needs to be filled less often, but more CPU memory is needed to store the patches.
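
The trade-off described for `in_memory` can be illustrated with a toy sketch. This is not GaNDLF's data loader: `SubjectStore`, `fake_reader`, and the file names are hypothetical stand-ins, and the only point is to count how often the underlying reader is called in each mode.

```python
from typing import Callable, Dict, List


class SubjectStore:
    """Toy stand-in for a dataset; hypothetical, not GaNDLF's actual loader."""

    def __init__(self, paths: List[str], reader: Callable[[str], bytes], in_memory: bool):
        self.paths = paths
        self.reader = reader
        self.in_memory = in_memory
        # in_memory=True: read every item once, up front, and keep it in RAM.
        self.cache: Dict[str, bytes] = (
            {p: reader(p) for p in paths} if in_memory else {}
        )

    def __getitem__(self, index: int) -> bytes:
        path = self.paths[index]
        if self.in_memory:
            return self.cache[path]
        # in_memory=False: read on demand each time and keep nothing around.
        return self.reader(path)


read_calls: List[str] = []


def fake_reader(path: str) -> bytes:
    """Pretend to read a file; records each call so reads can be counted."""
    read_calls.append(path)
    return path.encode()


paths = ["a.nii.gz", "b.nii.gz"]

eager = SubjectStore(paths, fake_reader, in_memory=True)
for _ in range(3):  # three "epochs"
    for i in range(len(paths)):
        eager[i]
eager_reads = len(read_calls)

read_calls.clear()
lazy = SubjectStore(paths, fake_reader, in_memory=False)
for _ in range(3):
    for i in range(len(paths)):
        lazy[i]
lazy_reads = len(read_calls)

print(eager_reads, lazy_reads)  # prints "2 6"
```

In this sketch the eager store reads each file once and holds both in memory, while the lazy store holds nothing but reads each file again on every pass.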
65 changes: 41 additions & 24 deletions docs/setup.md
@@ -22,64 +22,81 @@ Alternatively, you can run GaNDLF via [Docker](https://www.docker.com/). This ne

### Install PyTorch

GaNDLF's primary computational foundation is built on PyTorch, and as such it supports all hardware types that PyTorch supports. Please install PyTorch for your hardware type before installing GaNDLF. See the [PyTorch installation instructions](https://pytorch.org/get-started/previous-versions/#v1131) for more details. An example installation using CUDA, ROCm, and CPU-only is shown below:
GaNDLF's primary computational foundation is built on PyTorch, and as such it supports all hardware types that PyTorch supports. Please install PyTorch for your hardware type before installing GaNDLF. See the [PyTorch installation instructions](https://pytorch.org/get-started/previous-versions/#v1131) for more details.


First, create and activate your environment:
```bash
(base) $> conda create -n venv_gandlf python=3.9 -y
(base) $> conda activate venv_gandlf
(venv_gandlf) $> ### subsequent commands go here
### PyTorch installation - https://pytorch.org/get-started/previous-versions/#v210
## CUDA 12.1
# (venv_gandlf) $> pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
## CUDA 11.8
# (venv_gandlf) $> pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
## ROCm 5.7
# (venv_gandlf) $> pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/rocm5.7
## CPU-only
# (venv_gandlf) $> pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cpu
```

You may install PyTorch for CUDA, ROCm, or CPU-only. An exhaustive list of PyTorch installations can be found here: https://pytorch.org/get-started/previous-versions/#v210
Use one of the following depending on your needs:
- CUDA 12.1
```bash
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
```
- CUDA 11.8
```bash
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
```
- ROCm 5.7
```bash
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/rocm5.7
```
- CPU-only
```bash
pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cpu
```
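
For scripted setups, the four commands above differ only in the index URL. A minimal sketch of that mapping follows; `PYTORCH_INDEX_URLS`, `PINNED_PACKAGES`, and `pytorch_install_command` are hypothetical names introduced here, and the versions and URLs are copied from the commands above.

```python
# Index URLs and pinned versions copied from the install commands above.
# The names below are hypothetical helpers, not part of GaNDLF or PyTorch.
PYTORCH_INDEX_URLS = {
    "cu121": "https://download.pytorch.org/whl/cu121",
    "cu118": "https://download.pytorch.org/whl/cu118",
    "rocm5.7": "https://download.pytorch.org/whl/rocm5.7",
    "cpu": "https://download.pytorch.org/whl/cpu",
}
PINNED_PACKAGES = "torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1"


def pytorch_install_command(backend: str) -> str:
    """Return the pip command for one of the backends listed above."""
    try:
        index_url = PYTORCH_INDEX_URLS[backend]
    except KeyError:
        known = ", ".join(sorted(PYTORCH_INDEX_URLS))
        raise ValueError(f"unknown backend {backend!r}; expected one of: {known}") from None
    return f"pip install {PINNED_PACKAGES} --index-url {index_url}"


print(pytorch_install_command("cpu"))
```

The function only builds the command string; running it is left to the caller so that nothing is installed by accident.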

### Optional Dependencies

The following dependencies are **optional**, and are only needed to access specific features of GaNDLF.

```bash
(venv_gandlf) $> pip install openvino-dev==2023.0.1 # [OPTIONAL] to generate post-training optimized models for inference
(venv_gandlf) $> pip install mlcube_docker # [OPTIONAL] to deploy GaNDLF models as MLCube-compliant Docker containers
pip install openvino-dev==2023.0.1 # [OPTIONAL] to generate post-training optimized models for inference
pip install mlcube_docker # [OPTIONAL] to deploy GaNDLF models as MLCube-compliant Docker containers
```

### Install from Package Managers

This option is recommended for most users, and allows for the quickest way to get started with GaNDLF.

```bash
# continue from previous shell
(venv_gandlf) $> pip install gandlf # this will give you the latest stable release
## you can also use conda
# (venv_gandlf) $> conda install -c conda-forge gandlf -y
pip install gandlf # this will give you the latest stable release
```
You can also use conda:
```bash
conda install -c conda-forge gandlf -y
```

If you are interested in running the latest version of GaNDLF, you can install the nightly build by running the following command:

```bash
# continue from previous shell
(venv_gandlf) $> pip install --pre gandlf
## you can also use conda
# (venv_gandlf) $> conda install -c conda-forge/label/gandlf_dev -c conda-forge gandlf -y
pip install --pre gandlf
```

You can also use conda:
```bash
conda install -c conda-forge/label/gandlf_dev -c conda-forge gandlf -y
```

### Install from Sources

Use this option if you want to [contribute to GaNDLF](https://github.com/mlcommons/GaNDLF/blob/master/CONTRIBUTING.md), or are interested in making other code-level changes for your own use.

```bash
# continue from previous shell
(venv_gandlf) $> git clone https://github.com/mlcommons/GaNDLF.git
(venv_gandlf) $> cd GaNDLF
(venv_gandlf) $> pip install -e .
git clone https://github.com/mlcommons/GaNDLF.git
cd GaNDLF
pip install -e .
```

Test your installation:
```bash
gandlf --version
```
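
To check from Python which of the packages mentioned on this page are installed, a small sketch using the standard library's `importlib.metadata` is shown here. The helper `installed_version` is a hypothetical name, and the distribution names are assumed to match the names used in the `pip install` commands.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# Distribution names are assumed to match the pip commands on this page.
for name in ("gandlf", "torch", "torchvision", "torchaudio"):
    version = installed_version(name)
    print(f"{name}: {version if version is not None else 'not installed'}")
```

This only reads package metadata, so it reports versions without importing the packages themselves.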

## Docker Installation
