Commit

cleaned up set up, added in_memory explanation
Linardos committed Aug 31, 2024
2 parents c8c49d3 + 41a0705 commit 0c2c1da
Showing 31 changed files with 173 additions and 36 deletions.
2 changes: 1 addition & 1 deletion .devcontainer/onCreateCommand.sh
@@ -6,4 +6,4 @@ pip install wheel
pip install openvino-dev==2023.0.1 # [OPTIONAL] to generate optimized models for inference
pip install mlcube_docker # [OPTIONAL] to deploy GaNDLF models as MLCube-compliant Docker containers
pip install medmnist==2.1.0
-pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cpu
+pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cpu
2 changes: 1 addition & 1 deletion .devcontainer/postCreateCommand.sh
@@ -6,7 +6,7 @@
# if runnning on a GPU machine, install the GPU version of pytorch
if command -v nvidia-smi &> /dev/null
then
-pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
+pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
fi

pip install -e .
1 change: 1 addition & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -23,3 +23,4 @@ Note that if a box is left unchecked, PR merges will take longer than usual.
- [ ] [Usage documentation](https://github.com/mlcommons/GaNDLF/blob/master/docs) has been updated, if appropriate.
- [ ] Tests added or modified to [cover the changes](https://app.codecov.io/gh/mlcommons/GaNDLF); if coverage is reduced, please give explanation.
- [ ] If customized dependency installation is required (i.e., a separate `pip install` step is needed for PR to be functional), please ensure it is reflected in all the files that control the CI, namely: [python-test.yml](https://github.com/mlcommons/GaNDLF/blob/master/.github/workflows/python-test.yml), and all docker files [[1](https://github.com/mlcommons/GaNDLF/blob/master/Dockerfile-CPU),[2](https://github.com/mlcommons/GaNDLF/blob/devcontainer_build_fix/Dockerfile-CUDA11.6),[3](https://github.com/mlcommons/GaNDLF/blob/master/Dockerfile-ROCm)].
+- [ ] The `logging` library is being used and no `print` statements are left.
2 changes: 1 addition & 1 deletion .github/workflows/mlcube-test.yml
@@ -70,7 +70,7 @@ jobs:
python -m pip install --upgrade pip==24.0
python -m pip install wheel
python -m pip install openvino-dev==2023.0.1 mlcube_docker
-pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cpu
+pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cpu
pip install -e .
- name: Run mlcube deploy tests
working-directory: ./testing
2 changes: 1 addition & 1 deletion .github/workflows/openfl-test.yml
@@ -70,7 +70,7 @@ jobs:
sudo apt-get install libvips libvips-tools -y
python -m pip install --upgrade pip==24.0
python -m pip install wheel
-pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cpu
+pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cpu
pip install -e .
- name: Run generic unit tests to download data and construct CSVs
if: steps.changed-files-specific.outputs.only_modified == 'false' # Run on any non-docs change
2 changes: 1 addition & 1 deletion .github/workflows/python-test.yml
@@ -71,7 +71,7 @@ jobs:
python -m pip install --upgrade pip==24.0
python -m pip install wheel
python -m pip install openvino-dev==2023.0.1 mlcube_docker
-pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cpu
+pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cpu
pip install -e .
- name: Run generic unit tests
if: steps.changed-files-specific.outputs.only_modified == 'false' # Run on any non-docs change
2 changes: 1 addition & 1 deletion Dockerfile-CPU
@@ -9,7 +9,7 @@ RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt-get update && apt-get install -y python3.9 python3-pip libjpeg8-dev zlib1g-dev python3-dev libpython3.9-dev libffi-dev libgl1
RUN python3.9 -m pip install --upgrade pip==24.0
# EXPLICITLY install cpu versions of torch/torchvision (not all versions have +cpu modes on PyPI...)
-RUN python3.9 -m pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cpu
+RUN python3.9 -m pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cpu
RUN python3.9 -m pip install openvino-dev==2023.0.1 opencv-python-headless mlcube_docker

# Do some dependency installation separately here to make layer caching more efficient
2 changes: 1 addition & 1 deletion Dockerfile-CUDA11.8
@@ -12,7 +12,7 @@ RUN apt-get update && apt-get install -y software-properties-common
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt-get update && apt-get install -y python3.9 python3-pip libjpeg8-dev zlib1g-dev python3-dev libpython3.9-dev libffi-dev libgl1
RUN python3.9 -m pip install --upgrade pip==24.0
-RUN python3.9 -m pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu118
+RUN python3.9 -m pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu118
RUN python3.9 -m pip install openvino-dev==2023.0.1 opencv-python-headless mlcube_docker

# Do some dependency installation separately here to make layer caching more efficient
2 changes: 1 addition & 1 deletion Dockerfile-CUDA12.1
@@ -12,7 +12,7 @@ RUN apt-get update && apt-get install -y software-properties-common
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt-get update && apt-get install -y python3.9 python3-pip libjpeg8-dev zlib1g-dev python3-dev libpython3.9-dev libffi-dev libgl1
RUN python3.9 -m pip install --upgrade pip==24.0
-RUN python3.9 -m pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121
+RUN python3.9 -m pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu121
RUN python3.9 -m pip install openvino-dev==2023.0.1 opencv-python-headless mlcube_docker

# Do some dependency installation separately here to make layer caching more efficient
4 changes: 2 additions & 2 deletions Dockerfile-ROCm
@@ -1,4 +1,4 @@
-FROM rocm/pytorch:rocm5.7_ubuntu20.04_py3.9_pytorch_2.0.1
+FROM rocm/pytorch:rocm6.0_ubuntu20.04_py3.9_pytorch
LABEL github="https://github.com/mlcommons/GaNDLF"
LABEL docs="https://mlcommons.github.io/GaNDLF/"
LABEL version=1.0
@@ -10,7 +10,7 @@ RUN apt-get update && apt-get install -y software-properties-common
RUN add-apt-repository ppa:deadsnakes/ppa
RUN apt-get update && apt-get install -y python3.9 python3-pip libjpeg8-dev zlib1g-dev python3-dev libpython3.9-dev libffi-dev libgl1
RUN python3.9 -m pip install --upgrade pip==24.0
-RUN python3.9 -m pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/rocm5.7
+RUN python3.9 -m pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/rocm6.0
RUN python3.9 -m pip install --upgrade pip && python3.9 -m pip install openvino-dev==2023.0.1 opencv-python-headless mlcube_docker
RUN apt-get update && apt-get install -y libgl1

15 changes: 10 additions & 5 deletions GANDLF/compute/forward_pass.py
@@ -337,11 +337,16 @@ def validate_network(
if ext in [".jpg", ".jpeg", ".png"]:
pred_mask = pred_mask.astype(np.uint8)

-## special case for 2D
-if image.shape[-1] > 1:
-result_image = sitk.GetImageFromArray(pred_mask)
-else:
-result_image = sitk.GetImageFromArray(pred_mask.squeeze(0))
+pred_mask = (
+pred_mask.squeeze(0)
+if pred_mask.shape[0] == 1
+else (
+pred_mask.squeeze(-1)
+if pred_mask.shape[-1] == 1
+else pred_mask
+)
+)
+result_image = sitk.GetImageFromArray(pred_mask)
result_image.CopyInformation(img_for_metadata)

# this handles cases that need resampling/resizing
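
Note on the `forward_pass.py` change above: the new expression removes at most one singleton axis from `pred_mask`, preferring the leading axis and falling back to the trailing one. A minimal standalone sketch of that rule follows; the helper name `drop_singleton_edge_axis` is hypothetical and is not part of GaNDLF.

```python
import numpy as np


def drop_singleton_edge_axis(pred_mask: np.ndarray) -> np.ndarray:
    """Hypothetical helper mirroring the nested conditional in the diff.

    Drops the leading axis if it has length 1; otherwise drops the trailing
    axis if it has length 1; otherwise returns the array unchanged.
    Only one axis is ever removed.
    """
    if pred_mask.shape[0] == 1:
        return pred_mask.squeeze(0)
    if pred_mask.shape[-1] == 1:
        return pred_mask.squeeze(-1)
    return pred_mask


if __name__ == "__main__":
    for shape in [(1, 64, 64), (64, 64, 1), (32, 64, 64), (1, 64, 1)]:
        out = drop_singleton_edge_axis(np.zeros(shape, dtype=np.uint8))
        print(shape, "->", out.shape)
```

Unlike the removed branch, which decided based on `image.shape[-1]`, this rule looks only at the shape of the mask itself.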
9 changes: 8 additions & 1 deletion GANDLF/entrypoints/anonymizer.py
@@ -62,9 +62,16 @@ def _anonymize_images(
type=click.Path(),
help="Output directory or file which will contain the image(s) after anonymization.",
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
-def new_way(input_dir, config, modality, output_file):
+def new_way(input_dir, config, modality, output_file, log_file):
"""Anonymize images/scans in the data directory."""
+logger_setup(log_file)
_anonymize_images(input_dir, output_file, config, modality)
_anonymize_images(input_dir, output_file, config, modality)
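
Note on the `--log-file` pattern introduced here and repeated in the other entrypoints: each command now forwards an optional path to `logger_setup`. The body of `logger_setup` is not shown in this diff, so the following is only a rough sketch of what a helper with that call shape might do, using the standard `logging` module. The name `setup_logging_sketch`, the `name` parameter, and the log format are all assumptions for illustration.

```python
import logging
import os
import tempfile
from typing import Optional


def setup_logging_sketch(log_file: Optional[str] = None, name: str = "sketch") -> logging.Logger:
    """Hypothetical stand-in for a logger_setup(log_file)-style helper.

    Writes to `log_file` when a path is given, otherwise to the console.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # Close and drop handlers from earlier calls so repeated setup
    # does not duplicate output.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()
    handler = logging.FileHandler(log_file) if log_file else logging.StreamHandler()
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "run.log")
    log = setup_logging_sketch(path)
    log.info("anonymization started")
    with open(path) as fh:
        print(fh.read().strip())
```

With `default=None` on the option, omitting `--log-file` would fall through to the console branch in this sketch; what the real helper does in that case is not visible in the diff.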


2 changes: 1 addition & 1 deletion GANDLF/entrypoints/cli_tool.py
@@ -12,7 +12,7 @@
def gandlf(ctx):
"""GANDLF command-line tool."""
ctx.ensure_object(dict)
-logger_setup()
+# logger_setup()


# registers subcommands: `gandlf anonymizer`, `gandlf run`, etc.
9 changes: 8 additions & 1 deletion GANDLF/entrypoints/collect_stats.py
@@ -191,9 +191,16 @@ def _collect_stats(model_dir: str, output_dir: str):
required=True,
help="Output directory to save stats and plot",
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
-def new_way(model_dir: str, output_dir: str):
+def new_way(model_dir: str, output_dir: str, log_file: str):
"""Collect statistics from different testing/validation combinations from output directory."""
+logger_setup(log_file)
_collect_stats(model_dir=model_dir, output_dir=output_dir)


10 changes: 9 additions & 1 deletion GANDLF/entrypoints/config_generator.py
@@ -34,9 +34,17 @@ def _generate_config(config: str, strategy: str, output: str):
type=click.Path(file_okay=False, dir_okay=True),
help="Path to output directory.",
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
-def new_way(config, strategy, output):
+def new_way(config, strategy, output, log_file):
"""Generate multiple GaNDLF configurations based on a single baseline GaNDLF for experimentation."""
+
+logger_setup(log_file)
_generate_config(config, strategy, output)


9 changes: 9 additions & 0 deletions GANDLF/entrypoints/construct_csv.py
@@ -90,15 +90,24 @@ def _construct_csv(
help="If True, paths in the output data CSV will always be relative to the location"
" of the output data CSV itself.",
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
def new_way(
input_dir: str,
channels_id: str,
label_id: Optional[str],
output_file: str,
relativize_paths: bool,
+log_file: str,
):
"""Generate training/inference CSV from data directory."""
+
+logger_setup(log_file)
_construct_csv(
input_dir=input_dir,
channels_id=channels_id,
10 changes: 9 additions & 1 deletion GANDLF/entrypoints/debug_info.py
@@ -39,9 +39,17 @@ def _debug_info(verbose: bool):
is_flag=True,
help="If passed, prints all packages installed as well",
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
-def new_way(verbose: bool):
+def new_way(verbose: bool, log_file):
"""Displays detailed info about system environment: library versions, settings, etc."""
+
+logger_setup(log_file)
_debug_info(verbose=verbose)


8 changes: 8 additions & 0 deletions GANDLF/entrypoints/deploy.py
@@ -126,6 +126,12 @@ def _deploy(
help="An optional custom python entrypoint script to use instead of the default specified in mlcube.yaml."
" (Only for inference and metrics)",
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
def new_way(
model: Optional[str],
@@ -136,8 +142,10 @@ def new_way(
output_dir: str,
requires_gpu: bool,
entrypoint: Optional[str],
+log_file: str,
):
"""Generate frozen/deployable versions of trained GaNDLF models."""
+logger_setup(log_file)
_deploy(
model=model,
config=config,
9 changes: 9 additions & 0 deletions GANDLF/entrypoints/generate_metrics.py
@@ -53,6 +53,12 @@ def _generate_metrics(
default=-1,
help="The value to use for missing predictions as penalty; if `-1`, this does not get added. This is only used in the case where the targets and predictions are passed independently.",
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@click.option("--raw-input", hidden=True)
@append_copyright_to_help
def new_way(
@@ -61,8 +67,11 @@ def new_way(
output_file: Optional[str],
missing_prediction: int,
raw_input: str,
+log_file: str,
):
"""Metrics calculator."""
+
+logger_setup(log_file)
_generate_metrics(
input_data=input_data,
config=config,
13 changes: 12 additions & 1 deletion GANDLF/entrypoints/optimize_model.py
@@ -46,11 +46,22 @@ def _optimize_model(
required=False,
type=click.Path(exists=True, file_okay=True, dir_okay=False),
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
def new_way(
-model: str, config: Optional[str] = None, output_path: Optional[str] = None
+model: str,
+log_file: str,
+config: Optional[str] = None,
+output_path: Optional[str] = None,
):
"""Generate optimized versions of trained GaNDLF models."""
+
+logger_setup(log_file)
_optimize_model(model=model, config=config, output_path=output_path)


10 changes: 9 additions & 1 deletion GANDLF/entrypoints/patch_miner.py
@@ -42,9 +42,17 @@ def _mine_patches(input_path: str, output_dir: str, config: Optional[str]):
help="config (in YAML) for running the patch miner. Needs 'scale' and 'patch_size' to be defined, "
"otherwise defaults to 16 and (256, 256), respectively.",
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
-def new_way(input_csv: str, output_dir: str, config: Optional[str]):
+def new_way(input_csv: str, output_dir: str, log_file: str, config: Optional[str]):
"""Construct patches from whole slide image(s)."""
+
+logger_setup(log_file)
_mine_patches(input_path=input_csv, output_dir=output_dir, config=config)


9 changes: 9 additions & 0 deletions GANDLF/entrypoints/preprocess.py
@@ -82,6 +82,12 @@ def _preprocess(
is_flag=True,
help="If passed, applies zero cropping during output creation.",
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
def new_way(
config: str,
@@ -90,8 +96,11 @@ def new_way(
label_pad: str,
apply_augs: bool,
crop_zero: bool,
+log_file: str,
):
"""Generate training/inference data which are preprocessed to reduce resource footprint during computation."""
+
+logger_setup(log_file)
_preprocess(
config=config,
input_data=input_data,
10 changes: 9 additions & 1 deletion GANDLF/entrypoints/recover_config.py
@@ -47,10 +47,18 @@ def _recover_config(model_dir: Optional[str], mlcube: bool, output_file: str):
type=click.Path(file_okay=True, dir_okay=False),
help="Path to an output file where the config will be written.",
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
-def new_way(model_dir, mlcube, output_file):
+def new_way(model_dir, mlcube, output_file, log_file):
"""Recovers a config file from a GaNDLF model. If used from within a deployed GaNDLF MLCube,
attempts to extract the config from the embedded model."""
+
+logger_setup(log_file)
_recover_config(model_dir=model_dir, mlcube=mlcube, output_file=output_file)


9 changes: 9 additions & 0 deletions GANDLF/entrypoints/run.py
@@ -141,6 +141,12 @@ def _run(
help="Location to save the output of the inference session. Not used for training.",
)
@click.option("--raw-input", hidden=True)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
def new_way(
config: str,
@@ -152,8 +158,11 @@ def new_way(
resume: bool,
output_path: str,
raw_input: str,
+log_file: str,
):
"""Semantic segmentation, regression, and classification for medical images using Deep Learning."""
+
+logger_setup(log_file)
_run(
config=config,
input_data=input_data,
10 changes: 9 additions & 1 deletion GANDLF/entrypoints/split_csv.py
@@ -49,9 +49,17 @@ def _split_csv(input_csv: str, output_dir: str, config_path: Optional[str]):
help="The GaNDLF config (in YAML) with the `nested_training` key specified to the folds needed.",
type=click.Path(exists=True, file_okay=True, dir_okay=False),
)
+@click.option(
+"--log-file",
+type=click.Path(),
+default=None,
+help="Output file which will contain the logs.",
+)
@append_copyright_to_help
-def new_way(input_csv: str, output_dir: str, config: Optional[str]):
+def new_way(input_csv: str, output_dir: str, log_file: str, config: Optional[str]):
"""Split the data into training, validation, and testing sets and save them as csvs in the output directory."""
+
+logger_setup(log_file)
_split_csv(input_csv, output_dir, config)

