Commit

Merge remote-tracking branch 'yarikoptic/enh-codespell'
FabianIsensee committed Oct 6, 2023
2 parents ef4e3da + b046559 commit de48541
Showing 24 changed files with 66 additions and 39 deletions.
22 changes: 22 additions & 0 deletions .github/workflows/codespell.yml
@@ -0,0 +1,22 @@
---
name: Codespell

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

permissions:
  contents: read

jobs:
  codespell:
    name: Check for spelling errors
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Codespell
        uses: codespell-project/actions-codespell@v2
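
The workflow above runs Codespell on every push and pull request targeting master. The same check can be run locally before pushing; a minimal sketch, assuming codespell is installed from PyPI and ignoring any repository-specific configuration this commit may add elsewhere:

    pip install codespell
    codespell nnunetv2 documentation       # report suspected typos
    codespell -w nnunetv2 documentation    # optionally write the suggested fixes in place
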
2 changes: 1 addition & 1 deletion documentation/dataset_format.md
@@ -21,7 +21,7 @@ images). So these images could for example be a T1 and a T2 MRI (or whatever els
channels MUST have the same geometry (same shape, spacing (if applicable) etc.) and
must be co-registered (if applicable). Input channels are identified by nnU-Net by their FILE_ENDING: a four-digit integer at the end
of the filename. Image files must therefore follow the following naming convention: {CASE_IDENTIFIER}_{XXXX}.{FILE_ENDING}.
Hereby, XXXX is the 4-digit modality/channel identifier (should be unique for each modality/chanel, e.g., “0000” for T1, “0001” for
Hereby, XXXX is the 4-digit modality/channel identifier (should be unique for each modality/channel, e.g., “0000” for T1, “0001” for
T2 MRI, …) and FILE_ENDING is the file extension used by your image format (.png, .nii.gz, ...). See below for concrete examples.
The dataset.json file connects channel names with the channel identifiers in the 'channel_names' key (see below for details).
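For illustration (hypothetical case identifier): a case la_001 with T1 as channel 0000 and T2 as channel 0001, stored as compressed NIfTI, would use the image files la_001_0000.nii.gz and la_001_0001.nii.gz.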

2 changes: 1 addition & 1 deletion documentation/how_to_use_nnunet.md
@@ -189,7 +189,7 @@ wait
**Important: The first time a training is run nnU-Net will extract the preprocessed data into uncompressed numpy
arrays for speed reasons! This operation must be completed before starting more than one training of the same
configuration! Wait with starting subsequent folds until the first training is using the GPU! Depending on the
dataset size and your System this should oly take a couple of minutes at most.**
dataset size and your System this should only take a couple of minutes at most.**

If you insist on running DDP multi-GPU training, we got you covered:

2 changes: 1 addition & 1 deletion documentation/set_environment_variables.md
@@ -3,7 +3,7 @@
nnU-Net requires some environment variables so that it always knows where the raw data, preprocessed data and trained
models are. Depending on the operating system, these environment variables need to be set in different ways.

Variables can either be set permanently (recommended!) or you can decide to set them everytime you call nnU-Net.
Variables can either be set permanently (recommended!) or you can decide to set them every time you call nnU-Net.
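
Setting them permanently usually means adding export statements to the shell startup file; a minimal sketch for bash (paths are placeholders, variable names as used by nnU-Net v2):

    export nnUNet_raw=/path/to/nnUNet_raw
    export nnUNet_preprocessed=/path/to/nnUNet_preprocessed
    export nnUNet_results=/path/to/nnUNet_results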

# Linux & MacOS

2 changes: 1 addition & 1 deletion nnunetv2/dataset_conversion/generate_dataset_json.py
@@ -76,7 +76,7 @@ def generate_dataset_json(output_folder: str,
labels[l] = int(labels[l])

dataset_json = {
'channel_names': channel_names, # previously this was called 'modality'. I didnt like this so this is
'channel_names': channel_names, # previously this was called 'modality'. I didn't like this so this is
# channel_names now. Live with it.
'labels': labels,
'numTraining': num_training_cases,
2 changes: 1 addition & 1 deletion nnunetv2/ensembling/ensemble.py
@@ -144,7 +144,7 @@ def ensemble_crossvalidations(list_of_trained_model_folders: List[str],
for f in folds:
if not isdir(join(tr, f'fold_{f}', 'validation')):
raise RuntimeError(f'Expected model output directory does not exist. You must train all requested '
f'folds of the speficied model.\nModel: {tr}\nFold: {f}')
f'folds of the specified model.\nModel: {tr}\nFold: {f}')
files_here = subfiles(join(tr, f'fold_{f}', 'validation'), suffix='.npz', join=False)
if len(files_here) == 0:
raise RuntimeError(f"No .npz files found in folder {join(tr, f'fold_{f}', 'validation')}. Rerun your "
8 changes: 4 additions & 4 deletions nnunetv2/evaluation/evaluate_predictions.py
@@ -27,8 +27,8 @@ def key_to_label_or_region(key: str):
except ValueError:
key = key.replace('(', '')
key = key.replace(')', '')
splitted = key.split(',')
return tuple([int(i) for i in splitted if len(i) > 0])
split = key.split(',')
return tuple([int(i) for i in split if len(i) > 0])
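# illustrative example: a region key such as "(1, 2)" parses to the tuple (1, 2);
# a plain label key such as "2" would presumably never reach this branch, since the
# int conversion in the (not shown) try block above succeeds for it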


def save_summary_json(results: dict, output_file: str):
@@ -227,7 +227,7 @@ def evaluate_folder_entry_point():
help='Output file. Optional. Default: pred_folder/summary.json')
parser.add_argument('-np', type=int, required=False, default=default_num_processes,
help=f'number of processes used. Optional. Default: {default_num_processes}')
parser.add_argument('--chill', action='store_true', help='dont crash if folder_pred doesnt have all files that are present in folder_gt')
parser.add_argument('--chill', action='store_true', help='dont crash if folder_pred does not have all files that are present in folder_gt')
args = parser.parse_args()
compute_metrics_on_folder2(args.gt_folder, args.pred_folder, args.djfile, args.pfile, args.o, args.np, chill=args.chill)

@@ -245,7 +245,7 @@ def evaluate_simple_entry_point():
help='Output file. Optional. Default: pred_folder/summary.json')
parser.add_argument('-np', type=int, required=False, default=default_num_processes,
help=f'number of processes used. Optional. Default: {default_num_processes}')
parser.add_argument('--chill', action='store_true', help='dont crash if folder_pred doesnt have all files that are present in folder_gt')
parser.add_argument('--chill', action='store_true', help='dont crash if folder_pred does not have all files that are present in folder_gt')

args = parser.parse_args()
compute_metrics_on_folder_simple(args.gt_folder, args.pred_folder, args.l, args.o, args.np, args.il, chill=args.chill)
2 changes: 1 addition & 1 deletion nnunetv2/evaluation/find_best_configuration.py
@@ -285,7 +285,7 @@ def find_best_configuration_entry_point():
help='Set this flag to disable ensembling')
parser.add_argument('--no_overwrite', action='store_true',
help='If set we will not overwrite already ensembled files etc. May speed up concecutive '
'runs of this command (why would oyu want to do that?) at the risk of not updating '
'runs of this command (why would you want to do that?) at the risk of not updating '
'outdated results.')
args = parser.parse_args()

@@ -520,8 +520,8 @@ def save_plans(self, plans):

def generate_data_identifier(self, configuration_name: str) -> str:
"""
configurations are unique within each plans file but differnet plans file can have configurations with the
same name. In order to distinguish the assiciated data we need a data identifier that reflects not just the
configurations are unique within each plans file but different plans file can have configurations with the
same name. In order to distinguish the associated data we need a data identifier that reflects not just the
config but also the plans it originates from
"""
return self.plans_identifier + '_' + configuration_name
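# illustrative example (names hypothetical): plans_identifier 'nnUNetPlans' and
# configuration '3d_fullres' would yield the data identifier 'nnUNetPlans_3d_fullres'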
@@ -21,7 +21,7 @@ def extract_fingerprint_entry():
help='[OPTIONAL] Set this flag to overwrite existing fingerprints. If this flag is not set and a '
'fingerprint already exists, the fingerprint extractor will not run.')
parser.add_argument('--verbose', required=False, action='store_true',
help='Set this to print a lot of stuff. Useful for debugging. Will disable progrewss bar! '
help='Set this to print a lot of stuff. Useful for debugging. Will disable progress bar! '
'Recommended for cluster environments')
args, unrecognized_args = parser.parse_known_args()
extract_fingerprints(args.d, args.fpe, args.np, args.verify_dataset_integrity, args.clean, args.verbose)
@@ -91,7 +91,7 @@ def preprocess_entry():
"DECREASE -np IF YOUR RAM FILLS UP TOO MUCH!. Default: 8 processes for 2d, 4 "
"for 3d_fullres, 8 for 3d_lowres and 4 for everything else")
parser.add_argument('--verbose', required=False, action='store_true',
help='Set this to print a lot of stuff. Useful for debugging. Will disable progrewss bar! '
help='Set this to print a lot of stuff. Useful for debugging. Will disable progress bar! '
'Recommended for cluster environments')
args, unrecognized_args = parser.parse_known_args()
if args.np is None:
@@ -173,7 +173,7 @@ def plan_and_preprocess_entry():
"DECREASE -np IF YOUR RAM FILLS UP TOO MUCH!. Default: 8 processes for 2d, 4 "
"for 3d_fullres, 8 for 3d_lowres and 4 for everything else")
parser.add_argument('--verbose', required=False, action='store_true',
help='Set this to print a lot of stuff. Useful for debugging. Will disable progrewss bar! '
help='Set this to print a lot of stuff. Useful for debugging. Will disable progress bar! '
'Recommended for cluster environments')
args = parser.parse_args()

2 changes: 1 addition & 1 deletion nnunetv2/experiment_planning/verify_dataset_integrity.py
@@ -172,7 +172,7 @@ def verify_dataset_integrity(folder: str, num_processes: int = 8) -> None:
missing_labels.append(dataset[k]['label'])
ok = False
if not ok:
raise FileNotFoundError(f"Some expeted files were missing. Make sure you are properly referencing them "
raise FileNotFoundError(f"Some expected files were missing. Make sure you are properly referencing them "
f"in the dataset.json. Or use imagesTr & labelsTr folders!\nMissing images:"
f"\n{missing_images}\n\nMissing labels:\n{missing_labels}")
else:
4 changes: 2 additions & 2 deletions nnunetv2/imageio/base_reader_writer.py
@@ -61,7 +61,7 @@ def read_images(self, image_fnames: Union[List[str], Tuple[str, ...]]) -> Tuple[
:return:
1) a np.ndarray of shape (c, x, y, z) where c is the number of image channels (can be 1) and x, y, z are
the spatial dimensions (set x=1 for 2D! Example: (3, 1, 224, 224) for RGB image).
2) a dictionary with metadata. This can be anything. BUT it HAS to inclue a {'spacing': (a, b, c)} where a
2) a dictionary with metadata. This can be anything. BUT it HAS to include a {'spacing': (a, b, c)} where a
is the spacing of x, b of y and c of z! If an image doesn't have spacing, just set this to 1. For 2D, set
a=999 (largest spacing value! Make it larger than b and c)
@@ -79,7 +79,7 @@ def read_seg(self, seg_fname: str) -> Tuple[np.ndarray, dict]:
:return:
1) a np.ndarray of shape (1, x, y, z) where x, y, z are
the spatial dimensions (set x=1 for 2D! Example: (1, 1, 224, 224) for 2D segmentation).
2) a dictionary with metadata. This can be anything. BUT it HAS to inclue a {'spacing': (a, b, c)} where a
2) a dictionary with metadata. This can be anything. BUT it HAS to include a {'spacing': (a, b, c)} where a
is the spacing of x, b of y and c of z! If an image doesn't have spacing, just set this to 1. For 2D, set
a=999 (largest spacing value! Make it larger than b and c)
"""
4 changes: 2 additions & 2 deletions nnunetv2/inference/export_prediction.py
@@ -31,8 +31,8 @@ def convert_predicted_logits_to_segmentation_with_correct_shape(predicted_logits
properties_dict['shape_after_cropping_and_before_resampling'],
current_spacing,
properties_dict['spacing'])
# return value of resampling_fn_probabilities can be ndarray or Tensor but that doesnt matter because
# apply_inference_nonlin will covnert to torch
# return value of resampling_fn_probabilities can be ndarray or Tensor but that does not matter because
# apply_inference_nonlin will convert to torch
predicted_probabilities = label_manager.apply_inference_nonlin(predicted_logits)
del predicted_logits
segmentation = label_manager.convert_probabilities_to_segmentation(predicted_probabilities)
2 changes: 1 addition & 1 deletion nnunetv2/inference/predict_from_raw_data.py
@@ -805,7 +805,7 @@ def predict_entry_point():
if not isdir(args.o):
maybe_mkdir_p(args.o)

# slightly passive agressive haha
# slightly passive aggressive haha
assert args.part_id < args.num_parts, 'Do you even read the documentation? See nnUNetv2_predict -h.'

assert args.device in ['cpu', 'cuda',
2 changes: 1 addition & 1 deletion nnunetv2/inference/readme.md
@@ -82,7 +82,7 @@ need for the _0000 suffix anymore! This can be useful in situations where you ha
Remember that the files must be given as 'list of lists' where each entry in the outer list is a case to be predicted
and the inner list contains all the files belonging to that case. There is just one file for datasets with just one
input modality (such as CT) but may be more files for others (such as MRI where there is sometimes T1, T2, Flair etc).
IMPORTANT: the order in wich the files for each case are given must match the order of the channels as defined in the
IMPORTANT: the order in which the files for each case are given must match the order of the channels as defined in the
dataset.json!
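For instance (file names hypothetical), two cases from a T1+T2 dataset would be given as [['caseA_t1.nii.gz', 'caseA_t2.nii.gz'], ['caseB_t1.nii.gz', 'caseB_t2.nii.gz']], with the file order in each inner list matching the channel order defined in dataset.json.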

If you give files as input, you need to give individual output files as output!
4 changes: 2 additions & 2 deletions nnunetv2/postprocessing/remove_connected_components.py
@@ -71,7 +71,7 @@ def determine_postprocessing(folder_predictions: str,
if plans_file_or_dict is None:
expected_plans_file = join(folder_predictions, 'plans.json')
if not isfile(expected_plans_file):
raise RuntimeError(f"Expected plans file missing: {expected_plans_file}. The plans fils should have been "
raise RuntimeError(f"Expected plans file missing: {expected_plans_file}. The plans files should have been "
f"created while running nnUNetv2_predict. Sadge.")
plans_file_or_dict = load_json(expected_plans_file)
plans_manager = PlansManager(plans_file_or_dict)
@@ -80,7 +80,7 @@
expected_dataset_json_file = join(folder_predictions, 'dataset.json')
if not isfile(expected_dataset_json_file):
raise RuntimeError(
f"Expected plans file missing: {expected_dataset_json_file}. The plans fils should have been "
f"Expected plans file missing: {expected_dataset_json_file}. The plans files should have been "
f"created while running nnUNetv2_predict. Sadge.")
dataset_json_file_or_dict = load_json(expected_dataset_json_file)

2 changes: 1 addition & 1 deletion nnunetv2/run/load_pretrained_weights.py
@@ -9,7 +9,7 @@ def load_pretrained_weights(network, fname, verbose=False):
shape is also the same. Segmentation layers (the 1x1(x1) layers that produce the segmentation maps)
identified by keys ending with '.seg_layers') are not transferred!
If the pretrained weights were optained with a training outside nnU-Net and DDP or torch.optimize was used,
If the pretrained weights were obtained with a training outside nnU-Net and DDP or torch.optimize was used,
you need to change the keys of the pretrained state_dict. DDP adds a 'module.' prefix and torch.optim adds
'_orig_mod'. You DO NOT need to worry about this if pretraining was done with nnU-Net as
nnUNetTrainer.save_checkpoint takes care of that!
4 changes: 2 additions & 2 deletions nnunetv2/training/dataloading/nnunet_dataset.py
@@ -123,7 +123,7 @@ def load_case(self, key):

# this should have the properties
ds = nnUNetDataset(folder, num_images_properties_loading_threshold=1000)
# now rename the properties file so that it doesnt exist anymore
# now rename the properties file so that it does not exist anymore
shutil.move(join(folder, 'liver_0.pkl'), join(folder, 'liver_XXX.pkl'))
# now we should still be able to access the properties because they have already been loaded
ks = ds['liver_0'].keys()
@@ -133,7 +133,7 @@ def load_case(self, key):

# this should not have the properties
ds = nnUNetDataset(folder, num_images_properties_loading_threshold=0)
# now rename the properties file so that it doesnt exist anymore
# now rename the properties file so that it does not exist anymore
shutil.move(join(folder, 'liver_0.pkl'), join(folder, 'liver_XXX.pkl'))
# now this should crash
try:
@@ -45,7 +45,7 @@ def build_network_architecture(plans_manager: PlansManager,
'is non-standard (maybe your own?). Yo\'ll have to dive ' \
'into either this ' \
'function (get_network_from_plans) or ' \
'the init of your nnUNetModule to accomodate that.'
'the init of your nnUNetModule to accommodate that.'
network_class = mapping[segmentation_network_class_name]

conv_or_blocks_per_stage = {
4 changes: 2 additions & 2 deletions nnunetv2/utilities/dataset_name_id_conversion.py
@@ -70,5 +70,5 @@ def maybe_convert_to_dataset_name(dataset_name_or_id: Union[int, str]) -> str:
except ValueError:
raise ValueError("dataset_name_or_id was a string and did not start with 'Dataset' so we tried to "
"convert it to a dataset ID (int). That failed, however. Please give an integer number "
"('1', '2', etc) or a correct tast name. Your input: %s" % dataset_name_or_id)
return convert_id_to_dataset_name(dataset_name_or_id)
"('1', '2', etc) or a correct dataset name. Your input: %s" % dataset_name_or_id)
return convert_id_to_dataset_name(dataset_name_or_id)
12 changes: 6 additions & 6 deletions nnunetv2/utilities/file_path_utilities.py
@@ -39,10 +39,10 @@ def parse_dataset_trainer_plans_configuration_from_path(path: str):
assert len(folders[:idx]) >= 2, 'Bad path, cannot extract what I need. Your path needs to be at least ' \
'DatasetXXX/MODULE__PLANS__CONFIGURATION for this to work'
if folders[idx - 2].startswith('Dataset'):
splitted = folders[idx - 1].split('__')
assert len(splitted) == 3, 'Bad path, cannot extract what I need. Your path needs to be at least ' \
split = folders[idx - 1].split('__')
assert len(split) == 3, 'Bad path, cannot extract what I need. Your path needs to be at least ' \
'DatasetXXX/MODULE__PLANS__CONFIGURATION for this to work'
return folders[idx - 2], *splitted
return folders[idx - 2], *split
else:
# we can only check for dataset followed by a string that is separable into three strings by splitting with '__'
# look for DatasetXXX
@@ -51,10 +51,10 @@
idx = dataset_folder.index(True)
assert len(folders) >= (idx + 1), 'Bad path, cannot extract what I need. Your path needs to be at least ' \
'DatasetXXX/MODULE__PLANS__CONFIGURATION for this to work'
splitted = folders[idx + 1].split('__')
assert len(splitted) == 3, 'Bad path, cannot extract what I need. Your path needs to be at least ' \
split = folders[idx + 1].split('__')
assert len(split) == 3, 'Bad path, cannot extract what I need. Your path needs to be at least ' \
'DatasetXXX/MODULE__PLANS__CONFIGURATION for this to work'
return folders[idx], *splitted
return folders[idx], *split
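# illustrative example (names hypothetical): a path containing
# Dataset001_Example/nnUNetTrainer__nnUNetPlans__3d_fullres would be parsed into
# ('Dataset001_Example', 'nnUNetTrainer', 'nnUNetPlans', '3d_fullres')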


def get_ensemble_name(model1_folder, model2_folder, folds: Tuple[int, ...]):