clarified in_memory, also cleaned up setup page a bit

mlcommons · Aug 31, 2024 · c8c49d3
1 parent b7c9930
commit c8c49d3

Showing 3 changed files with 56 additions and 39 deletions.
diff --git a/GANDLF/metrics/classification.py b/GANDLF/metrics/classification.py
@@ -94,11 +94,11 @@ def __convert_tensor_to_int(input_tensor: torch.Tensor) -> torch.Tensor:
                 num_classes=params["model"]["num_classes"],
                 average=average_type_key,
             ),
-            f"auroc_{average_type}": tm.AUROC(
-                task=task,
-                num_classes=params["model"]["num_classes"],
-                average=average_type_key if average_type_key != "micro" else "macro",
-            ),
+            # f"auroc_{average_type}": tm.AUROC(
+            #     task=task,
+            #     num_classes=params["model"]["num_classes"],
+            #     average=average_type_key if average_type_key != "micro" else "macro",
+            # ),
         }
         for metric_name, calculator in calculators.items():
             if "auroc" in metric_name:

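Note on the hunk above: the expression `average_type_key if average_type_key != "micro" else "macro"` falls back to macro averaging whenever micro is requested, presumably because the multiclass AUROC metric does not offer a micro average. With the `auroc_*` entry commented out, no key in `calculators` contains `"auroc"`, so the `if "auroc" in metric_name` branch is no longer reached. A minimal pure-Python sketch of both points follows; the helper names `auroc_average` and `auroc_metric_names` are hypothetical and not part of GaNDLF.

```python
from typing import Iterable, List


def auroc_average(average_type_key: str) -> str:
    """Mirror the fallback in the diff: request "macro" when "micro" is asked for."""
    return average_type_key if average_type_key != "micro" else "macro"


def auroc_metric_names(metric_names: Iterable[str]) -> List[str]:
    """Return the metric names that would enter the `"auroc" in metric_name` branch."""
    return [name for name in metric_names if "auroc" in name]


# Before the change, an AUROC entry exists for each averaging type.
before = ["f1_micro", "auroc_micro"]
# After the change, the AUROC entry is commented out.
after = ["f1_micro"]

print(auroc_average("micro"))      # fallback applies
print(auroc_average("weighted"))   # passed through unchanged
print(auroc_metric_names(before))  # the AUROC branch is taken once
print(auroc_metric_names(after))   # the AUROC branch is never taken
```
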
diff --git a/docs/customize.md b/docs/customize.md
@@ -119,17 +119,17 @@ This file contains mid-level information regarding various parameters that can b
 
 - These are various parameters that control the overall training process.
 - `verbose`: generate verbose messages on console; generally used for debugging.
-- `batch_size`: defines the batch size to be used for training.
-- `in_memory`: this is to enable or disable lazy loading - setting to true reads all data once during data loading, resulting in improvements.
-- `num_epochs`: defines the number of epochs to train for.
-- `patience`: defines the number of epochs to wait for improvement before early stopping.
-- `learning_rate`: defines the learning rate to be used for training.
-- `scheduler`: defines the learning rate scheduler to be used for training, more details are [here](https://github.com/mlcommons/GaNDLF/blob/master/GANDLF/schedulers/__init__.py); can take the following sub-parameters:
+- `batch_size`: batch size to be used for training.
+- `in_memory`: controls lazy loading. If set to `True`, all data is loaded into RAM once during data loading, resulting in faster training. If set to `False`, data is read into RAM on demand, which slows down training but reduces the memory load. The latter is recommended if the user's RAM has limited capacity.
+- `num_epochs`: number of epochs to train for.
+- `patience`: number of epochs to wait for improvement before early stopping.
+- `learning_rate`: learning rate to be used for training.
+- `scheduler`: learning rate scheduler to be used for training, more details are [here](https://github.com/mlcommons/GaNDLF/blob/master/GANDLF/schedulers/__init__.py); can take the following sub-parameters:
     - `type`: `triangle`, `triangle_modified`, `exp`, `step`, `reduce-on-plateau`, `cosineannealing`, `triangular`, `triangular2`, `exp_range`
-    - `min_lr`: defines the minimum learning rate to be used for training.
-    - `max_lr`: defines the maximum learning rate to be used for training.
-- `optimizer`: defines the optimizer to be used for training, more details are [here](https://github.com/mlcommons/GaNDLF/blob/master/GANDLF/optimizers/__init__.py).
-- `nested_training`: defines the number of folds to use nested training, takes `testing` and `validation` as sub-parameters, with integer values defining the number of folds to use.
+    - `min_lr`: minimum learning rate to be used for training.
+    - `max_lr`: maximum learning rate to be used for training.
+- `optimizer`: optimizer to be used for training, more details are [here](https://github.com/mlcommons/GaNDLF/blob/master/GANDLF/optimizers/__init__.py).
+- `nested_training`: number of folds to use for nested training; takes `testing` and `validation` as sub-parameters, with integer values defining the number of folds to use.
 - `memory_save_mode`: if enabled, resize/resample operations in `data_preprocessing` will save files to disk instead of reading them directly into memory as tensors.
 - **Queue configuration**: this defines how the queue for the input to the model is to be designed **after** the [patching strategy](#patching-strategy) has been applied, and more details are [here](https://torchio.readthedocs.io/data/patch_training.html?#queue). This takes the following sub-parameters:
     - `q_max_length`: this determines the maximum number of patches that can be stored in the queue. Using a large number means that the queue needs to be filled less often, but more CPU memory is needed to store the patches.

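Note on the `in_memory` wording in the hunk above: the trade-off it describes (read everything once and keep it in RAM, versus read each item whenever it is requested) can be illustrated with a toy dataset. This is a minimal sketch under stated assumptions, not GaNDLF's actual data loader; `SubjectStore`, `count_reads`, and the file names are hypothetical.

```python
from typing import Callable, List


class SubjectStore:
    """Toy stand-in for a dataset; `load_fn` is a hypothetical per-item reader."""

    def __init__(self, paths: List[str], load_fn: Callable[[str], bytes], in_memory: bool):
        self.paths = paths
        self.load_fn = load_fn
        # in_memory=True: read every item up front and keep it in RAM.
        self.cache = [load_fn(p) for p in paths] if in_memory else None

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, index: int) -> bytes:
        if self.cache is not None:
            return self.cache[index]  # served from RAM, no further reads
        return self.load_fn(self.paths[index])  # in_memory=False: read on demand


def count_reads(in_memory: bool, num_items: int = 4, num_epochs: int = 3) -> int:
    """Count how many (pretend) disk reads happen over several epochs."""
    reads = 0

    def fake_read(path: str) -> bytes:
        nonlocal reads
        reads += 1
        return path.encode()

    paths = [f"subject_{i}.nii.gz" for i in range(num_items)]
    store = SubjectStore(paths, fake_read, in_memory)
    for _ in range(num_epochs):
        for i in range(len(store)):
            _ = store[i]
    return reads


print("in_memory=True :", count_reads(True), "reads")
print("in_memory=False:", count_reads(False), "reads")
```

In this sketch the eager variant reads each item once regardless of the number of epochs, at the cost of holding all items in memory, while the lazy variant repeats every read in every epoch but holds only one item at a time.
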
diff --git a/docs/setup.md b/docs/setup.md
@@ -22,64 +22,81 @@ Alternatively, you can run GaNDLF via [Docker](https://www.docker.com/). This ne
 
 ### Install PyTorch 
 
-GaNDLF's primary computational foundation is built on PyTorch, and as such it supports all hardware types that PyTorch supports. Please install PyTorch for your hardware type before installing GaNDLF. See the [PyTorch installation instructions](https://pytorch.org/get-started/previous-versions/#v1131) for more details. An example installation using CUDA, ROCm, and CPU-only is shown below:
+GaNDLF's primary computational foundation is built on PyTorch, and as such it supports all hardware types that PyTorch supports. Please install PyTorch for your hardware type before installing GaNDLF. See the [PyTorch installation instructions](https://pytorch.org/get-started/previous-versions/#v1131) for more details. 
 
+
+First, create and activate your environment:
 ```bash
 (base) $> conda create -n venv_gandlf python=3.9 -y
 (base) $> conda activate venv_gandlf
 (venv_gandlf) $> ### subsequent commands go here
-### PyTorch installation - https://pytorch.org/get-started/previous-versions/#v210
-## CUDA 12.1
-# (venv_gandlf) $> pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
-## CUDA 11.8
-# (venv_gandlf) $> pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
-## ROCm 5.7
-# (venv_gandlf) $> pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/rocm5.7
-## CPU-only
-# (venv_gandlf) $> pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cpu
+```
+
+Install the PyTorch build that matches your hardware: CUDA, ROCm, or CPU-only. The full list of PyTorch installations can be found here: https://pytorch.org/get-started/previous-versions/#v210
+Use one of the following commands, depending on your needs:
+- CUDA 12.1
+```bash
+ pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
+```
+- CUDA 11.8
+```bash
+ pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
+```
+- ROCm 5.7
+```bash
+ pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/rocm5.7
+```
+- CPU-only
+```bash
+ pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cpu
 ```
 
 ### Optional Dependencies 
 
 The following dependencies are **optional**, and are only needed to access specific features of GaNDLF.
 
 ```bash
-(venv_gandlf) $> pip install openvino-dev==2023.0.1 # [OPTIONAL] to generate post-training optimized models for inference
-(venv_gandlf) $> pip install mlcube_docker # [OPTIONAL] to deploy GaNDLF models as MLCube-compliant Docker containers
+pip install openvino-dev==2023.0.1 # [OPTIONAL] to generate post-training optimized models for inference
+pip install mlcube_docker # [OPTIONAL] to deploy GaNDLF models as MLCube-compliant Docker containers
 ```
 
 ### Install from Package Managers
 
 This option is recommended for most users, and allows for the quickest way to get started with GaNDLF.
 
 ```bash
-# continue from previous shell
-(venv_gandlf) $> pip install gandlf # this will give you the latest stable release
-## you can also use conda
-# (venv_gandlf) $> conda install -c conda-forge gandlf -y
+pip install gandlf # this will give you the latest stable release
+```
+You can also use conda:
+```bash
+conda install -c conda-forge gandlf -y
 ```
 
 If you are interested in running the latest version of GaNDLF, you can install the nightly build by running the following command:
 
 ```bash
-# continue from previous shell
-(venv_gandlf) $> pip install --pre gandlf
-## you can also use conda
-# (venv_gandlf) $> conda install -c conda-forge/label/gandlf_dev -c conda-forge gandlf -y
+pip install --pre gandlf
 ```
 
+You can also use conda:
+```bash
+conda install -c conda-forge/label/gandlf_dev -c conda-forge gandlf -y
+```
 
 ### Install from Sources
 
 Use this option if you want to [contribute to GaNDLF](https://github.com/mlcommons/GaNDLF/blob/master/CONTRIBUTING.md), or are interested in making other code-level changes for your own use.
 
 ```bash
-# continue from previous shell
-(venv_gandlf) $> git clone https://github.com/mlcommons/GaNDLF.git
-(venv_gandlf) $> cd GaNDLF
-(venv_gandlf) $> pip install -e .
+git clone https://github.com/mlcommons/GaNDLF.git
+cd GaNDLF
+pip install -e .
 ```
 
+Test your installation:
+```bash
+gandlf --version
+```
 
 ## Docker Installation
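
Note on the "Test your installation" step added in the `docs/setup.md` hunk above: the same check can be done from Python by querying the installed package metadata. This is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the distribution names `torch` and `gandlf` are assumed to match the `pip install` commands shown in the diff.

```python
from importlib import metadata
from typing import Dict, Optional, Sequence


def installed_versions(packages: Sequence[str] = ("torch", "gandlf")) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if it is missing."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

A `None` entry for `torch` suggests the PyTorch step was skipped or ran in a different environment than the one currently active.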