# Mixed precision training, new models and improvements

## Highlights

### Mixed precision support for all models
torchvision models now support mixed-precision training via the new `torch.cuda.amp` package. Using mixed precision is easy: just run the model's forward pass and the loss computation inside a `torch.cuda.amp.autocast` context manager. Here is an example with Faster R-CNN:
```python
import torch, torchvision

device = torch.device('cuda')

model = torchvision.models.detection.fasterrcnn_resnet50_fpn()
model.to(device)

input = [torch.rand(3, 300, 400, device=device)]
# random boxes in (x1, y1, x2, y2) format; adding the top-left corner to the
# bottom-right coordinates guarantees x2 >= x1 and y2 >= y1
boxes = torch.rand((5, 4), dtype=torch.float32, device=device)
boxes[:, 2:] += boxes[:, :2]
target = [{"boxes": boxes,
           "labels": torch.zeros(5, dtype=torch.int64, device=device),
           "image_id": 4,
           "area": torch.zeros(5, dtype=torch.float32, device=device),
           "iscrowd": torch.zeros((5,), dtype=torch.int64, device=device)}]

# use automatic mixed precision
with torch.cuda.amp.autocast():
    loss_dict = model(input, target)
    losses = sum(loss for loss in loss_dict.values())

# perform backward outside of autocast context manager
losses.backward()
```
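For end-to-end mixed-precision training you will typically combine `autocast` with gradient scaling. Below is a minimal sketch (not part of the release notes) using PyTorch 1.6's `torch.cuda.amp.GradScaler`; the SGD optimizer and learning rate are illustrative placeholders:

```python
# Sketch: autocast + GradScaler training step, reusing model/input/target from above.
# The SGD optimizer and lr=0.005 are illustrative choices, not from the release notes.
optimizer = torch.optim.SGD(model.parameters(), lr=0.005)
scaler = torch.cuda.amp.GradScaler()

optimizer.zero_grad()
with torch.cuda.amp.autocast():
    loss_dict = model(input, target)
    losses = sum(loss for loss in loss_dict.values())

# scale the loss before backward, then unscale gradients and step the optimizer
scaler.scale(losses).backward()
scaler.step(optimizer)
scaler.update()
```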
### New pre-trained segmentation models

This release adds pre-trained weights for the ResNet50 variants of Fully-Convolutional Networks (FCN) and DeepLabV3.
They are available under `torchvision.models.segmentation`, and can be obtained as follows:

```python
torchvision.models.segmentation.fcn_resnet50(pretrained=True)
torchvision.models.segmentation.deeplabv3_resnet50(pretrained=True)
```
They obtain the following accuracies:
Network | mean IoU | global pixelwise accuracy
--- | --- | ---
FCN ResNet50 | 60.5 | 91.4
DeepLabV3 ResNet50 | 66.4 | 92.4
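As a quick usage note (a sketch, not part of the release notes): these models return a dict of tensors whose `"out"` entry holds the per-pixel class logits; the input size and random input below are purely illustrative.

```python
import torch
import torchvision

# Inference sketch: the pretrained weights expect a normalized RGB batch and
# return a dict whose "out" entry has per-class logits of shape [N, 21, H, W].
model = torchvision.models.segmentation.fcn_resnet50(pretrained=True).eval()

img = torch.rand(1, 3, 520, 520)        # stand-in for a normalized image batch
with torch.no_grad():
    logits = model(img)["out"]          # [1, 21, 520, 520]
preds = logits.argmax(dim=1)            # [1, 520, 520] per-pixel class indices
```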
### Improved ONNX support for Faster / Mask / Keypoint R-CNN

This release restores ONNX support for the R-CNN family of models, which had been temporarily dropped in the 0.6.0 release, and additionally fixes a number of corner cases in the ONNX export for these models.
Notable improvements include support for dynamic input shape export, including images with no detections.
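For reference, exporting one of these models looks roughly like the sketch below (an illustration, not from the release notes; the file name is arbitrary and opset 11 is assumed, since the detection models need a recent opset):

```python
import torch
import torchvision

# Export sketch: R-CNN models are exported in eval mode and take a list of 3xHxW tensors.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=False).eval()
dummy_input = [torch.rand(3, 300, 400)]

torch.onnx.export(
    model,
    (dummy_input,),        # model arguments passed as a tuple
    "fasterrcnn.onnx",     # arbitrary output file name
    opset_version=11,      # assumed; detection models need a recent opset
)
```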
## Backwards Incompatible Changes

- [Transforms] Fix for integer fill value in constant padding (#2284)
- [Models] Replace L1 loss with smooth L1 loss in Faster R-CNN for better performance (#2113)
- [Transforms] Use `torch.rand` instead of `random.random()` for random transforms (#2520)
## New Features

- [Models] Add mixed-precision support (#2366, #2384)
- [Models] Add `fcn_resnet50` and `deeplabv3_resnet50` pretrained models (#2086, #2091)
- [Ops] Add eps attribute to FrozenBatchNorm2d (#2190)
- [Transforms] Add `convert_image_dtype` to functionals (#2078)
- [Transforms] Add `pil_to_tensor` to functionals (#2092)
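A brief usage sketch of the two new functionals listed above (illustrative image size and target dtype, not from the release notes):

```python
import torch
from PIL import Image
import torchvision.transforms.functional as F

# pil_to_tensor converts a PIL image to a tensor without rescaling its values
pil_img = Image.new("RGB", (64, 48))
t = F.pil_to_tensor(pil_img)                   # dtype torch.uint8, shape [3, 48, 64]

# convert_image_dtype changes dtype and rescales to the new dtype's value range
f = F.convert_image_dtype(t, torch.float32)    # values now in [0.0, 1.0]
```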
## Bug Fixes

- [JIT] Fix virtualenv and torchhub support by removing eager scripting calls (#2248)
- [IO] Fix `write_video` when floating point FPS is passed (#2334)
- [IO] Fix missing compilation files for video-reader (#2183)
- [IO] Fix missing include for OSX in video decoder (#2224)
- [IO] Fix overflow error for large buffers (#2303)
- [Ops] Fix wrong clamping in RoIAlign with `aligned=True` (#2438)
- [Ops] Fix corner case in `interpolate` (#2146)
- [Ops] Fix the use of `contiguous()` in C++ kernels (#2131)
- [Ops] Restore support of tuple of Tensors for region pooling ops (#2199)
- [Datasets] Fix bug related to trailing slash on UCF-101 dataset (#2186)
- [Models] Make copy of targets in GeneralizedRCNNTransform (#2227)
- [Models] Fix DenseNet issue with gradient checkpoints (#2236)
- [ONNX] Fix ONNX implementation of `heatmaps_to_keypoints` in KeypointRCNN (#2312)
- [ONNX] Fix export of images with no detection for Faster / Mask / Keypoint R-CNN (#2126, #2215, #2272)
## Deprecations

- [Ops] Deprecate Conv2d, ConvTranspose2d and BatchNorm2d (#2244)
- [Ops] Deprecate `interpolate` in favor of PyTorch's implementation (#2252)
## Improvements

### Datasets

- Fix DatasetFolder error message (#2143)
- Change `range(len)` to `enumerate` in `DatasetFolder` (#2153)
- [DOC] Fix link URL to Flickr8k (#2178)
- [DOC] Add CelebA to docs (#2107)
- [DOC] Improve documentation of `DatasetFolder` and `ImageFolder` (#2112)
### TorchHub
- Fix torchhub tests due to numerical changes in torch.sum (#2361)
- Add all the latest models to hubconf (#2189)
### Transforms

- Add `fill` argument to `__repr__` of `RandomRotation` (#2340)
- Add tensor support for `adjust_hue` (#2300, #2355)
- Make `ColorJitter` torchscriptable (#2298)
- Make `RandomHorizontalFlip` and `RandomVerticalFlip` torchscriptable (#2282) (see the sketch after this list)
- [DOC] Use consistent symbols in the doc of `Normalize` to avoid confusion (#2181)
- [DOC] Fix typo in `hflip` in `functional.py` (#2177)
- [DOC] Fix spelling errors in `functional.py` (#2333)
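A small sketch of the newly torchscriptable transforms (illustrative, not from the release notes; `p=1.0` just makes the flip deterministic for the check):

```python
import torch
import torchvision.transforms as T

# RandomHorizontalFlip can now be compiled with torch.jit.script and applied
# to image tensors directly; p=1.0 makes the flip deterministic here.
flip = torch.jit.script(T.RandomHorizontalFlip(p=1.0))

img = torch.rand(3, 32, 32)                 # illustrative CHW image tensor
out = flip(img)
assert torch.equal(out, img.flip(-1))       # horizontal flip of the last dim
```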
### IO

- Refactor `video.py` to improve clarity (#2335)
- Save memory by not storing full frames in `read_video_timestamps` (#2202, #2268)
- Improve warning when `video_reader` backend is not available (#2225)
- Set `should_buffer` to True by default in `_read_from_stream` (#2201)
- [Test] Temporarily disable one PyAV test (#2150)
### Models
- Improve target checks in GeneralizedRCNN (#2207, #2258)
- Use Module objects instead of functions for some layers of Inception3 (#2287)
- Add support for other normalizations in MobileNetV2 (#2267)
- Expose layer freezing option to detection models (#2160, #2242)
- Make ASPP-Layer in DeepLab more generic (#2174)
- Faster initialization for Inception family of models (#2170, #2211)
- Make `norm_layer` a parameter in `models/detection/backbone_utils.py` (#2081)
- Update integer division to use the floor division operator (#2234, #2243)
- [JIT] Clean up no longer needed workarounds for torchscript support (#2249, #2261, #2210)
- [DOC] Add docs to clarify aspect ratio definition in RPN. (#2185)
- [DOC] Fix roi_heads argument name in docstring of GeneralizedRCNN (#2093)
- [DOC] Fix type annotation in RPN docstring (#2149)
- [DOC] add clarifications to Object detection reference documentation (#2241)
- [Test] Add tests for negative samples for Mask R-CNN and Keypoint R-CNN (#2069)
### Reference scripts

- Add support for SyncBatchNorm in QAT reference script (#2230, #2280)
- Fix training resuming in `references/segmentation` (#2142)
- Rename `image` to `images` in `references/detection/engine.py` (#2187)
### ONNX
- Add support for dynamic input shape export in R-CNN models (#2087)
### Ops

- Add number of features to FrozenBatchNorm2d `__repr__` (#2168)
- Improve consistency among box IoU CPU / GPU calculations (#2072)
- Avoid `using` in header files (#2257)
- Make `ceil_div` `__host__ __device__` (#2217)
- Don't include CUDAApplyUtils.cuh (#2127)
- Add namespace to avoid conflict with ATen version of `channel_shuffle()` (#2206)
- [DOC] Update the statement of supporting torchscript ops (#2343)
- [DOC] Update torchvision ops in doc (#2341)
- [DOC] Improve documentation for NMS (#2159)
- [Test] Add more tests to NMS (#2279)
### Misc
- Add PyTorch version compatibility table to README (#2260)
- Fix lint (#2182, #2226, #2070)
- Update version to 0.6.0 in CMake (#2140)
- Remove mock (#2096)
- Remove warning about deprecated (#2064)
- Cleanup unused import (#2067)
- Type annotations for torchvision/utils.py (#2034)
### CI
- Add version suffix to build version
- Add backslash to escape
- Add workflows to run on tag
- Bump version to 0.7.0, pin PyTorch to 1.6.0
- Update link for cudnn 10.2 (#2277)
- Fix binary builds with CUDA 9.2 on Windows (#2273)
- Remove Python 3.5 from CI (#2158)
- Improvements to CI infra (#2075, #2071, #2058, #2073, #2099, #2137, #2204, #2264, #2274, #2319)
- Master version bump 0.6 -> 0.7 (#2102)
- Add test channels for pytorch version functions (#2208)
- Add static type check with mypy (#2195, #1696, #2247)