Mixed precision training, new models and improvements

@fmassa fmassa released this 28 Jul 15:04
78ed10c

Highlights

Mixed precision support for all models

torchvision models now support mixed-precision training via the new torch.cuda.amp package. To enable it, wrap the forward pass and the loss computation in a torch.cuda.amp.autocast context manager. Here is an example with Faster R-CNN:

import torch, torchvision

device = torch.device('cuda')

model = torchvision.models.detection.fasterrcnn_resnet50_fpn()
model.to(device)

input = [torch.rand(3, 300, 400, device=device)]
boxes = torch.rand((5, 4), dtype=torch.float32, device=device)
boxes[:, 2:] += boxes[:, :2]
target = [{"boxes": boxes,
          "labels": torch.zeros(5, dtype=torch.int64, device=device),
          "image_id": 4,
          "area": torch.zeros(5, dtype=torch.float32, device=device),
          "iscrowd": torch.zeros((5,), dtype=torch.int64, device=device)}]

# use automatic mixed precision
with torch.cuda.amp.autocast():
    loss_dict = model(input, target)
losses = sum(loss for loss in loss_dict.values())
# perform backward outside of autocast context manager
losses.backward()

New pre-trained segmentation models

This release adds pre-trained weights for the ResNet50 variants of Fully-Convolutional Networks (FCN) and DeepLabV3.
They are available under torchvision.models.segmentation, and can be obtained as follows:

torchvision.models.segmentation.fcn_resnet50(pretrained=True)
torchvision.models.segmentation.deeplabv3_resnet50(pretrained=True)

They obtain the following accuracies:

Network              mean IoU   global pixelwise acc
FCN ResNet50         60.5       91.4
DeepLabV3 ResNet50   66.4       92.4

Improved ONNX support for Faster / Mask / Keypoint R-CNN

This release restores ONNX support for the R-CNN family of models that had been temporarily dropped in the 0.6.0 release, and additionally fixes a number of corner cases in the ONNX export for these models.
Notable improvements include support for exporting with dynamic input shapes and for exporting images that produce no detections.

Backwards Incompatible Changes

  • [Transforms] Fix for integer fill value in constant padding (#2284)
  • [Models] Replace L1 loss with smooth L1 loss in Faster R-CNN for better performance (#2113)
  • [Transforms] Use torch.rand instead of random.random() for random transforms (#2520)

New Features

  • [Models] Add mixed-precision support (#2366, #2384)
  • [Models] Add fcn_resnet50 and deeplabv3_resnet50 pretrained models. (#2086, #2091)
  • [Ops] Add eps attribute to FrozenBatchNorm2d (#2190)
  • [Transforms] Add convert_image_dtype to functionals (#2078)
  • [Transforms] Add pil_to_tensor to functionals (#2092)

Bug Fixes

  • [JIT] Fix virtualenv and torchhub support by removing eager scripting calls (#2248)
  • [IO] Fix write_video when floating point FPS is passed (#2334)
  • [IO] Fix missing compilation files for video-reader (#2183)
  • [IO] Fix missing include for OSX in video decoder (#2224)
  • [IO] Fix overflow error for large buffers. (#2303)
  • [Ops] Fix wrong clamping in RoIAlign with aligned=True (#2438)
  • [Ops] Fix corner case in interpolate (#2146)
  • [Ops] Fix the use of contiguous() in C++ kernels (#2131)
  • [Ops] Restore support of tuple of Tensors for region pooling ops (#2199)
  • [Datasets] Fix bug related to trailing slash on UCF-101 dataset (#2186)
  • [Models] Make copy of targets in GeneralizedRCNNTransform (#2227)
  • [Models] Fix DenseNet issue with gradient checkpoints (#2236)
  • [ONNX] Fix ONNX implementation of heatmaps_to_keypoints in KeypointRCNN (#2312)
  • [ONNX] Fix export of images with no detection for Faster / Mask / Keypoint R-CNN (#2126, #2215, #2272)

Deprecations

  • [Ops] Deprecate Conv2d, ConvTranspose2d and BatchNorm2d (#2244)
  • [Ops] Deprecate interpolate in favor of PyTorch's implementation (#2252)
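For code that used the deprecated torchvision wrappers, the PyTorch equivalents can be used directly. A short sketch with arbitrary layer sizes chosen for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.rand(1, 3, 8, 8)

# PyTorch's own interpolate replaces the deprecated torchvision one.
upsampled = F.interpolate(x, scale_factor=2, mode="nearest")

# Standard torch.nn layers replace the deprecated layer wrappers.
conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)
deconv = nn.ConvTranspose2d(3, 4, kernel_size=2, stride=2)
bn = nn.BatchNorm2d(3)

conv_out = conv(x)
deconv_out = deconv(x)
bn_out = bn(x)
```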

Improvements

Datasets

  • Fix DatasetFolder error message (#2143)
  • Change range(len) to enumerate in DatasetFolder (#2153)
  • [DOC] Fix link URL to Flickr8k (#2178)
  • [DOC] Add CelebA to docs (#2107)
  • [DOC] Improve documentation of DatasetFolder and ImageFolder (#2112)

TorchHub

  • Fix torchhub tests due to numerical changes in torch.sum (#2361)
  • Add all the latest models to hubconf (#2189)

Transforms

  • Add fill argument to __repr__ of RandomRotation (#2340)
  • Add tensor support for adjust_hue (#2300, #2355)
  • Make ColorJitter torchscriptable (#2298)
  • Make RandomHorizontalFlip and RandomVerticalFlip torchscriptable (#2282)
  • [DOC] Use consistent symbols in the doc of Normalize to avoid confusion (#2181)
  • [DOC] Fix typo in hflip in functional.py (#2177)
  • [DOC] Fix spelling errors in functional.py (#2333)

IO

  • Refactor video.py to improve clarity (#2335)
  • Save memory by not storing full frames in read_video_timestamps (#2202, #2268)
  • Improve warning when video_reader backend is not available (#2225)
  • Set should_buffer to True by default in _read_from_stream (#2201)
  • [Test] Temporarily disable one PyAV test (#2150)

Models

  • Improve target checks in GeneralizedRCNN (#2207, #2258)
  • Use Module objects instead of functions for some layers of Inception3 (#2287)
  • Add support for other normalizations in MobileNetV2 (#2267)
  • Expose layer freezing option to detection models (#2160, #2242)
  • Make ASPP-Layer in DeepLab more generic (#2174)
  • Faster initialization for Inception family of models (#2170, #2211)
  • Make norm_layer a parameter in models/detection/backbone_utils.py (#2081)
  • Update integer division to use the floor division operator (#2234, #2243)
  • [JIT] Clean up no longer needed workarounds for torchscript support (#2249, #2261, #2210)
  • [DOC] Add docs to clarify aspect ratio definition in RPN. (#2185)
  • [DOC] Fix roi_heads argument name in docstring of GeneralizedRCNN (#2093)
  • [DOC] Fix type annotation in RPN docstring (#2149)
  • [DOC] Add clarifications to object detection reference documentation (#2241)
  • [Test] Add tests for negative samples for Mask R-CNN and Keypoint R-CNN (#2069)

Reference scripts

  • Add support for SyncBatchNorm in QAT reference script (#2230, #2280)
  • Fix training resuming in references/segmentation (#2142)
  • Rename image to images in references/detection/engine.py (#2187)

ONNX

  • Add support for dynamic input shape export in R-CNN models (#2087)

Ops

  • Add number of features to FrozenBatchNorm2d __repr__ (#2168)
  • Improve consistency among box IoU CPU / GPU calculations (#2072)
  • Avoid "using" in header files (#2257)
  • Make ceil_div __host__ __device__ (#2217)
  • Don't include CUDAApplyUtils.cuh (#2127)
  • Add namespace to avoid conflict with ATen version of channel_shuffle() (#2206)
  • [DOC] Update the statement of supporting torchscript ops (#2343)
  • [DOC] Update torchvision ops in doc (#2341)
  • [DOC] Improve documentation for NMS (#2159)
  • [Test] Add more tests to NMS (#2279)

Misc

  • Add PyTorch version compatibility table to README (#2260)
  • Fix lint (#2182, #2226, #2070)
  • Update version to 0.6.0 in CMake (#2140)
  • Remove mock (#2096)
  • Remove warning about deprecated (#2064)
  • Cleanup unused import (#2067)
  • Type annotations for torchvision/utils.py (#2034)

CI

  • Add version suffix to build version
  • Add backslash to escape
  • Add workflows to run on tag
  • Bump version to 0.7.0, pin PyTorch to 1.6.0
  • Update link for cudnn 10.2 (#2277)
  • Fix binary builds with CUDA 9.2 on Windows (#2273)
  • Remove Python 3.5 from CI (#2158)
  • Improvements to CI infra (#2075, #2071, #2058, #2073, #2099, #2137, #2204, #2264, #2274, #2319)
  • Master version bump 0.6 -> 0.7 (#2102)
  • Add test channels for pytorch version functions (#2208)
  • Add static type check with mypy (#2195, #1696, #2247)