# Mixed precision training, new models and improvements

## Highlights

### Mixed precision support for all models
torchvision models now support mixed-precision training via the new `torch.cuda.amp` package. Using mixed precision is easy: just run the model's forward pass and the loss computation inside a `torch.cuda.amp.autocast` context manager. Here is an example with Faster R-CNN:
```python
import torch, torchvision

device = torch.device('cuda')

model = torchvision.models.detection.fasterrcnn_resnet50_fpn()
model.to(device)

input = [torch.rand(3, 300, 400, device=device)]
# random boxes in (x1, y1, x2, y2) format; adding the top-left corner to the
# bottom-right coordinates guarantees x2 >= x1 and y2 >= y1
boxes = torch.rand((5, 4), dtype=torch.float32, device=device)
boxes[:, 2:] += boxes[:, :2]
target = [{"boxes": boxes,
           "labels": torch.zeros(5, dtype=torch.int64, device=device),
           "image_id": 4,
           "area": torch.zeros(5, dtype=torch.float32, device=device),
           "iscrowd": torch.zeros((5,), dtype=torch.int64, device=device)}]

# use automatic mixed precision
with torch.cuda.amp.autocast():
    loss_dict = model(input, target)
    losses = sum(loss for loss in loss_dict.values())

# perform backward outside of autocast context manager
losses.backward()
```
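For end-to-end mixed-precision training you will typically combine `autocast` with gradient scaling. Below is a minimal sketch (not part of the release notes) using PyTorch 1.6's `torch.cuda.amp.GradScaler`; the SGD optimizer and learning rate are illustrative placeholders:

```python
# Sketch: autocast + GradScaler training step, reusing model/input/target from above.
# The SGD optimizer and lr=0.005 are illustrative choices, not from the release notes.
optimizer = torch.optim.SGD(model.parameters(), lr=0.005)
scaler = torch.cuda.amp.GradScaler()

optimizer.zero_grad()
with torch.cuda.amp.autocast():
    loss_dict = model(input, target)
    losses = sum(loss for loss in loss_dict.values())

# scale the loss before backward, then unscale gradients and step the optimizer
scaler.scale(losses).backward()
scaler.step(optimizer)
scaler.update()
```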
### New pre-trained segmentation models

This release adds pre-trained weights for the ResNet50 variants of Fully-Convolutional Networks (FCN) and DeepLabV3.
They are available under `torchvision.models.segmentation`, and can be obtained as follows:

```python
torchvision.models.segmentation.fcn_resnet50(pretrained=True)
torchvision.models.segmentation.deeplabv3_resnet50(pretrained=True)
```
They obtain the following accuracies:
Network | mean IoU | global pixelwise accuracy
--- | --- | ---
FCN ResNet50 | 60.5 | 91.4
DeepLabV3 ResNet50 | 66.4 | 92.4
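As a quick usage note (a sketch, not part of the release notes): these models return a dict of tensors whose `"out"` entry holds the per-pixel class logits; the input size and random input below are purely illustrative.

```python
import torch
import torchvision

# Inference sketch: the pretrained weights expect a normalized RGB batch and
# return a dict whose "out" entry has per-class logits of shape [N, 21, H, W].
model = torchvision.models.segmentation.fcn_resnet50(pretrained=True).eval()

img = torch.rand(1, 3, 520, 520)        # stand-in for a normalized image batch
with torch.no_grad():
    logits = model(img)["out"]          # [1, 21, 520, 520]
preds = logits.argmax(dim=1)            # [1, 520, 520] per-pixel class indices
```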
### Improved ONNX support for Faster / Mask / Keypoint R-CNN

This release restores ONNX support for the R-CNN family of models, which had been temporarily dropped in the 0.6.0 release, and additionally fixes a number of corner cases in the ONNX export for these models.
Notable improvements include support for dynamic input shape export, including images with no detections.
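For reference, exporting one of these models looks roughly like the sketch below (an illustration, not from the release notes; the file name is arbitrary and opset 11 is assumed, since the detection models need a recent opset):

```python
import torch
import torchvision

# Export sketch: R-CNN models are exported in eval mode and take a list of 3xHxW tensors.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False).eval()
dummy_input = [torch.rand(3, 300, 400)]

torch.onnx.export(
    model,
    (dummy_input,),        # model arguments passed as a tuple
    "fasterrcnn.onnx",     # arbitrary output file name
    opset_version=11,      # assumed; detection models need a recent opset
)
```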
## Backwards Incompatible Changes

- [Transforms] Fix for integer fill value in constant padding (#2284)
- [Models] Replace L1 loss with smooth L1 loss in Faster R-CNN for better performance (#2113)
- [Transforms] Use `torch.rand` instead of `random.random()` for random transforms (#2520)
## New Features

- [Models] Add mixed-precision support (#2366, #2384)
- [Models] Add `fcn_resnet50` and `deeplabv3_resnet50` pretrained models (#2086, #2091)
- [Ops] Add eps attribute to FrozenBatchNorm2d (#2190)
- [Transforms] Add `convert_image_dtype` to functionals (#2078)
- [Transforms] Add `pil_to_tensor` to functionals (#2092)
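A brief usage sketch of the two new functionals listed above (illustrative image size and target dtype, not from the release notes):

```python
import torch
from PIL import Image
import torchvision.transforms.functional as F

# pil_to_tensor converts a PIL image to a tensor without rescaling its values
pil_img = Image.new("RGB", (64, 48))
t = F.pil_to_tensor(pil_img)                   # dtype torch.uint8, shape [3, 48, 64]

# convert_image_dtype changes dtype and rescales to the new dtype's value range
f = F.convert_image_dtype(t, torch.float32)    # values now in [0.0, 1.0]
```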
## Bug Fixes

- [JIT] Fix virtualenv and torchhub support by removing eager scripting calls (#2248)
- [IO] Fix `write_video` when floating point FPS is passed (#2334)
- [IO] Fix missing compilation files for video-reader (#2183)
- [IO] Fix missing include for OSX in video decoder (#2224)
- [IO] Fix overflow error for large buffers (#2303)
- [Ops] Fix wrong clamping in RoIAlign with `aligned=True` (#2438)
- [Ops] Fix corner case in `interpolate` (#2146)
- [Ops] Fix the use of `contiguous()` in C++ kernels (#2131)
- [Ops] Restore support of tuple of Tensors for region pooling ops (#2199)
- [Datasets] Fix bug related to trailing slash on UCF-101 dataset (#2186)
- [Models] Make copy of targets in GeneralizedRCNNTransform (#2227)
- [Models] Fix DenseNet issue with gradient checkpoints (#2236)
- [ONNX] Fix ONNX implementation of `heatmaps_to_keypoints` in KeypointRCNN (#2312)
- [ONNX] Fix export of images with no detection for Faster / Mask / Keypoint R-CNN (#2126, #2215, #2272)
## Deprecations

- [Ops] Deprecate Conv2d, ConvTranspose2d and BatchNorm2d (#2244)
- [Ops] Deprecate `interpolate` in favor of PyTorch's implementation (#2252)
## Improvements

### Datasets

- Fix DatasetFolder error message (#2143)
- Change `range(len)` to `enumerate` in `DatasetFolder` (#2153)
- [DOC] Fix link URL to Flickr8k (#2178)
- [DOC] Add CelebA to docs (#2107)
- [DOC] Improve documentation of `DatasetFolder` and `ImageFolder` (#2112)
### TorchHub
- Fix torchhub tests due to numerical changes in torch.sum (#2361)
- Add all the latest models to hubconf (#2189)
### Transforms

- Add `fill` argument to `__repr__` of `RandomRotation` (#2340)
- Add tensor support for `adjust_hue` (#2300, #2355)
- Make `ColorJitter` torchscriptable (#2298)
- Make `RandomHorizontalFlip` and `RandomVerticalFlip` torchscriptable (#2282) (see the sketch after this list)
- [DOC] Use consistent symbols in the doc of `Normalize` to avoid confusion (#2181)
- [DOC] Fix typo in `hflip` in `functional.py` (#2177)
- [DOC] Fix spelling errors in `functional.py` (#2333)
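A small sketch of the newly torchscriptable transforms (illustrative, not from the release notes; `p=1.0` just makes the flip deterministic for the check):

```python
import torch
import torchvision.transforms as T

# RandomHorizontalFlip can now be compiled with torch.jit.script and applied
# to image tensors directly; p=1.0 makes the flip deterministic here.
flip = torch.jit.script(T.RandomHorizontalFlip(p=1.0))

img = torch.rand(3, 32, 32)                 # illustrative CHW image tensor
out = flip(img)
assert torch.equal(out, img.flip(-1))       # horizontal flip of the last dim
```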
### IO

- Refactor `video.py` to improve clarity (#2335)
- Save memory by not storing full frames in `read_video_timestamps` (#2202, #2268)
- Improve warning when `video_reader` backend is not available (#2225)
- Set `should_buffer` to True by default in `_read_from_stream` (#2201)
- [Test] Temporarily disable one PyAV test (#2150)
### Models
- Improve target checks in GeneralizedRCNN (#2207, #2258)
- Use Module objects instead of functions for some layers of Inception3 (#2287)
- Add support for other normalizations in MobileNetV2 (#2267)
- Expose layer freezing option to detection models (#2160, #2242)
- Make ASPP-Layer in DeepLab more generic (#2174)
- Faster initialization for Inception family of models (#2170, #2211)
- Make `norm_layer` a parameter in `models/detection/backbone_utils.py` (#2081)
- Update integer division to use the floor division operator (#2234, #2243)
- [JIT] Clean up no longer needed workarounds for torchscript support (#2249, #2261, #2210)
- [DOC] Add docs to clarify aspect ratio definition in RPN. (#2185)
- [DOC] Fix roi_heads argument name in docstring of GeneralizedRCNN (#2093)
- [DOC] Fix type annotation in RPN docstring (#2149)
- [DOC] add clarifications to Object detection reference documentation (#2241)
- [Test] Add tests for negative samples for Mask R-CNN and Keypoint R-CNN (#2069)
### Reference scripts

- Add support for SyncBatchNorm in QAT reference script (#2230, #2280)
- Fix training resuming in `references/segmentation` (#2142)
- Rename `image` to `images` in `references/detection/engine.py` (#2187)
### ONNX
- Add support for dynamic input shape export in R-CNN models (#2087)
### Ops

- Add number of features to FrozenBatchNorm2d `__repr__` (#2168)
- Improve consistency among box IoU CPU / GPU calculations (#2072)
- Avoid `using` in header files (#2257)
- Make `ceil_div` `__host__ __device__` (#2217)
- Don't include CUDAApplyUtils.cuh (#2127)
- Add namespace to avoid conflict with ATen version of `channel_shuffle()` (#2206)
- [DOC] Update the statement of supporting torchscript ops (#2343)
- [DOC] Update torchvision ops in doc (#2341)
- [DOC] Improve documentation for NMS (#2159)
- [Test] Add more tests to NMS (#2279)
### Misc
- Add PyTorch version compatibility table to README (#2260)
- Fix lint (#2182, #2226, #2070)
- Update version to 0.6.0 in CMake (#2140)
- Remove mock (#2096)
- Remove warning about deprecated (#2064)
- Cleanup unused import (#2067)
- Type annotations for torchvision/utils.py (#2034)
### CI
- Add version suffix to build version
- Add backslash to escape
- Add workflows to run on tag
- Bump version to 0.7.0, pin PyTorch to 1.6.0
- Update link for cudnn 10.2 (#2277)
- Fix binary builds with CUDA 9.2 on Windows (#2273)
- Remove Python 3.5 from CI (#2158)
- Improvements to CI infra (#2075, #2071, #2058, #2073, #2099, #2137, #2204, #2264, #2274, #2319)
- Master version bump 0.6 -> 0.7 (#2102)
- Add test channels for pytorch version functions (#2208)
- Add static type check with mypy (#2195, #1696, #2247)