
Stage 4: Image Selection and Resizing

CyberMeow edited this page Dec 24, 2023 · 7 revisions

Select the images to keep and save them, resized, to the training folder

  • If you start from this stage, please set --src_dir to the folder containing both classified and raw (/path/to/dataset_dir/intermediate/{image_type} by default).
  • Output folder: /path/to/dataset_dir/training/{image_type}
  • After this stage, you can go over the images to select those you want to keep.

The images obtained after this stage are meant to be the ones used for training.

Image selection criteria

The folder names in the .../classified directory are first read and saved in the characters field of the images' metadata (for cropped and original images alike). Folder names should be of the form {number}_{character_name}, or simply {character_name} as long as the name does not start with {number}_ (otherwise the program parses it using the first format). After this, the images are selected in the following way:

  • For cropped images: select those whose size is smaller than half of the original image
  • For original images:
    • Select those with characters
    • In the case of the screenshots pipeline, also select --n_anime_reg images with no characters for style regularization (all of them are used if there are not enough images with no characters)
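
The folder naming rule described above can be illustrated with a minimal sketch. The function name parse_character_folder and the regular expression are hypothetical and are not taken from the pipeline's code; they only mirror the rule that a leading {number}_ prefix is stripped whenever it is present.

```python
import re

# Assumed pattern for the "{number}_{character_name}" form (illustrative only).
_NUMBERED = re.compile(r"^(\d+)_(.+)$")


def parse_character_folder(folder_name: str) -> str:
    """Return the character name encoded in a classified folder name.

    Hypothetical helper: a name starting with "{number}_" is parsed with the
    first format; any other name is taken as the character name itself.
    """
    match = _NUMBERED.match(folder_name)
    if match:
        # First format: drop the numeric prefix and keep the rest.
        return match.group(2)
    # Second format: the whole folder name is the character name.
    return folder_name
```

Under this sketch, a folder named 3_alice and a folder named alice both yield the character name alice, which is why a character name that itself begins with digits followed by an underscore cannot be expressed in the second format.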

⚠️ The cropped images are mapped back to the original images using file paths. Thus, moving the original images to another place, removing them, or renaming them would cause errors at this stage. You can use correct_path_field.py if the file names remain untouched while the files are moved.

Command line arguments

  • no_cropped_in_dataset: Exclude cropped images from the dataset.
    Example usage: --no_cropped_in_dataset
  • no_original_in_dataset: Exclude original images from the dataset.
    Example usage: --no_original_in_dataset
  • no_resize: Skip the image resizing process and copy files as they are.
    Example usage: --no_resize
  • max_size: Maximum image size to resize to, aligning the shorter edge. The default is 768.
    Example usage: --max_size 1024
  • image_save_ext: Specify the image extension for resized images. The default is .webp.
    Example usage: --image_save_ext .jpg
  • filter_again: Enable filtering of repeated images again at this stage. This is useful since cropped images can be similar even if the full images are not.
    Example usage: --filter_again
  • overwrite_emb_init_info: The file emb_init.json is also saved in the output folder at this stage. By default, the original content of the file is kept if the file already exists. With this argument, the file is completely overwritten.
    Example usage: --overwrite_emb_init_info
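
As a rough illustration of what max_size could mean in practice, the sketch below computes the output dimensions of an image whose shorter edge is brought down to max_size while keeping the aspect ratio. The function name target_size is hypothetical, and the assumption that images with a shorter edge at or below max_size are left at their original size is not confirmed by this page.

```python
from typing import Tuple


def target_size(width: int, height: int, max_size: int = 768) -> Tuple[int, int]:
    """Compute resized dimensions so that the shorter edge equals max_size.

    Assumption: images whose shorter edge is already at most max_size are
    not upscaled and keep their original dimensions.
    """
    shorter = min(width, height)
    if shorter <= max_size:
        return width, height
    # Scale both edges by the same factor to preserve the aspect ratio.
    new_width = round(width * max_size / shorter)
    new_height = round(height * max_size / shorter)
    return new_width, new_height
```

With the default of 768, a 1920x1080 screenshot would come out as 1365x768 under these assumptions, while a 640x480 image would be left unchanged.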

Arguments specific to screenshots pipeline

  • n_anime_reg: Set the number of images with no characters to include in the dataset. The default number is 500.
    Example usage: --n_anime_reg 1000
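
The effect of n_anime_reg, including the fallback to using all images when there are not enough without characters, can be sketched as follows. The function name pick_regularization_images and the use of random sampling are assumptions made for illustration; this page does not say how the pipeline chooses which images to keep.

```python
import random
from typing import List, Optional, Sequence


def pick_regularization_images(
    no_character_images: Sequence[str],
    n_anime_reg: int = 500,
    seed: Optional[int] = None,
) -> List[str]:
    """Pick up to n_anime_reg images with no characters.

    Assumption: the choice is a uniform random sample; the actual pipeline
    may select the images differently.
    """
    if len(no_character_images) <= n_anime_reg:
        # Not enough images to choose from: use all of them.
        return list(no_character_images)
    rng = random.Random(seed)
    return rng.sample(list(no_character_images), n_anime_reg)
```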

Arguments specific to booru pipeline

  • character_overwrite_uncropped: Overwrite existing character metadata for uncropped images. This is only relevant for the booru pipeline, as the metadata is always overwritten otherwise.
    Example usage: --character_overwrite_uncropped
  • character_remove_unclassified: Remove unclassified characters from the character metadata field.
    Example usage: --character_remove_unclassified