Releases: pytorch/vision
Update dependency on wheels to match version in PyPI
Users were reporting issues installing torchvision from PyPI. This release updates the dependencies of the wheels to point directly to torch==1.10.0.
RegNet, EfficientNet, FX Feature Extraction and more
This release introduces the RegNet and EfficientNet architectures, a new FX-based utility to perform Feature Extraction, new data augmentation techniques such as RandAugment and TrivialAugment, updated training recipes that support EMA, Label Smoothing, Learning-Rate Warmup, Mixup and Cutmix, and more.
Highlights
New Models
RegNet and EfficientNet are two popular architectures that can be scaled to different computational budgets. In this release we include 22 pre-trained weights for their classification variants. The models were trained on ImageNet and can be used as follows:
```python
import torch
from torchvision import models

x = torch.rand(1, 3, 224, 224)

regnet = models.regnet_y_400mf(pretrained=True)
regnet.eval()
predictions = regnet(x)

efficientnet = models.efficientnet_b0(pretrained=True)
efficientnet.eval()
predictions = efficientnet(x)
```
The accuracies of the pre-trained models obtained on the ImageNet val set are shown below (see #4403, #4530 and #4293 for more details):
Model | Acc@1 | Acc@5 |
---|---|---|
regnet_x_400mf | 72.834 | 90.95 |
regnet_x_800mf | 75.212 | 92.348 |
regnet_x_1_6gf | 77.04 | 93.44 |
regnet_x_3_2gf | 78.364 | 93.992 |
regnet_x_8gf | 79.344 | 94.686 |
regnet_x_16gf | 80.058 | 94.944 |
regnet_x_32gf | 80.622 | 95.248 |
regnet_y_400mf | 74.046 | 91.716 |
regnet_y_800mf | 76.42 | 93.136 |
regnet_y_1_6gf | 77.95 | 93.966 |
regnet_y_3_2gf | 78.948 | 94.576 |
regnet_y_8gf | 80.032 | 95.048 |
regnet_y_16gf | 80.424 | 95.24 |
regnet_y_32gf | 80.878 | 95.34 |
EfficientNet-B0 | 77.692 | 93.532 |
EfficientNet-B1 | 78.642 | 94.186 |
EfficientNet-B2 | 80.608 | 95.31 |
EfficientNet-B3 | 82.008 | 96.054 |
EfficientNet-B4 | 83.384 | 96.594 |
EfficientNet-B5 | 83.444 | 96.628 |
EfficientNet-B6 | 84.008 | 96.916 |
EfficientNet-B7 | 84.122 | 96.908 |
We would like to thank Ross Wightman and Luke Melas-Kyriazi for contributing the weights of the EfficientNet variants.
FX-based Feature Extraction
A new Feature Extraction method has been added to our utilities. It uses PyTorch FX and enables us to retrieve the outputs of intermediate layers of a network which is useful for feature extraction and visualization. Here is an example of how to use the new utility:
```python
import torch
from torchvision.models import resnet50
from torchvision.models.feature_extraction import create_feature_extractor

x = torch.rand(1, 3, 224, 224)
model = resnet50()

return_nodes = {
    "layer4.2.relu_2": "layer4"
}
model2 = create_feature_extractor(model, return_nodes=return_nodes)
intermediate_outputs = model2(x)
print(intermediate_outputs['layer4'].shape)
```
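To discover which node names are available for `return_nodes`, the package also exposes a helper. A minimal sketch, reusing the `model` from above (note that the train- and eval-mode graphs can differ):

```python
from torchvision.models.feature_extraction import get_graph_node_names

# node names for train and eval mode (the two graphs can differ)
train_nodes, eval_nodes = get_graph_node_names(model)
print(eval_nodes[-5:])
```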
We would like to thank Alexander Soare for developing this utility.
New Data Augmentations
Two new Automatic Augmentation techniques were added: RandAugment and TrivialAugment. Both methods can be used as drop-in replacements for the AutoAugment technique, as shown below:
```python
from torchvision import transforms

# image is a PIL Image (or, for most ops, a uint8 Tensor)
t = transforms.RandAugment()
# t = transforms.TrivialAugmentWide()
transformed = t(image)

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.RandAugment(),  # transforms.TrivialAugmentWide()
    transforms.ToTensor()])
```
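Both transforms also expose their main knobs. As a small sketch, RandAugment's number of operations and magnitude can be tuned (the values below are the defaults):

```python
from torchvision import transforms

# apply 2 randomly chosen ops per image, each with magnitude 9 (out of 31 bins)
t = transforms.RandAugment(num_ops=2, magnitude=9)
```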
We would like to thank Samuel G. Müller for contributing Trivial Augment and for his help on refactoring the AA package.
Updated Training Recipes
We have updated our training reference scripts to add support for Exponential Moving Average, Label Smoothing, Learning-Rate Warmup, Mixup, Cutmix and other SOTA primitives. These enabled us to improve the classification Acc@1 of some pre-trained models by over 4 points. A major update of the existing pre-trained weights is expected in the next release.
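For illustration, here is a minimal sketch of two of these primitives expressed with core PyTorch APIs from the matching PyTorch release (label smoothing via `nn.CrossEntropyLoss` and a linear learning-rate warmup chained into a cosine schedule); the full recipes live in the `references/classification` scripts:

```python
import torch
from torch import nn

model = nn.Linear(10, 5)  # stand-in for a real classifier
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.5)

# linear warmup for the first 5 epochs, cosine annealing afterwards
warmup = torch.optim.lr_scheduler.LinearLR(optimizer, start_factor=0.01, total_iters=5)
cosine = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=95)
scheduler = torch.optim.lr_scheduler.SequentialLR(
    optimizer, schedulers=[warmup, cosine], milestones=[5])
```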
Backward-incompatible changes
- [models] Use torch instead of scipy for random initialization of inception and googlenet weights (#4256)
Deprecations
- [models] Deprecate the C++ vision::models namespace (#4375)
New Features
- [datasets] Add iNaturalist dataset (#4123)
- [datasets] Add download support for Kinetics 400/600/700 datasets (#3680)
- [datasets] Added LFW Dataset (#4255)
- [models] Add FX feature extraction as an alternative to intermediate_layer_getter (#4302) (#4418)
- [models] Add RegNet Architecture in TorchVision (#4403) (#4530) (#4550)
- [ops] Add new masks_to_boxes op (#4290) (#4469) (see the sketch after this list)
- [ops] Add StochasticDepth implementation (#4301)
- [reference scripts] Adding Mixup and Cutmix (#4379)
- [transforms] Integration of TrivialAugment with the current AutoAugment code (#4221)
- [transforms] Adding RandAugment implementation (#4348)
- [models] Add EfficientNet Architecture in TorchVision (#4293)
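As a quick illustration of the new `masks_to_boxes` op referenced above, a minimal sketch converting boolean instance masks into `(x1, y1, x2, y2)` boxes:

```python
import torch
from torchvision.ops import masks_to_boxes

# two 8x8 boolean masks, one bounding box per mask
masks = torch.zeros(2, 8, 8, dtype=torch.bool)
masks[0, 1:4, 2:5] = True
masks[1, 5:8, 0:3] = True
boxes = masks_to_boxes(masks)  # shape (2, 4), format (x1, y1, x2, y2)
```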
Improvements
- Various documentation improvements (#4239) (#4251) (#4275) (#4342) (#3894) (#4159) (#4133) (#4138) (#4089) (#3944) (#4349) (#3754) (#4308) (#4352) (#4318) (#4244) (#4362) (#3863) (#4382) (#4484) (#4503) (#4376) (#4457) (#4505) (#4363) (#4361) (#4337) (#4546) (#4553) (#4565) (#4567) (#4574) (#4575) (#4383) (#4390) (#3409) (#4451) (#4340) (#3967) (#4072) (#4028) (#4132)
- [build] Add CUDA-11.3 builds to torchvision (#4248)
- [ci, tests] Skip some CPU-only tests on CircleCI machines with GPU (#4002) (#4025) (#4062)
- [ci] New issue templates (#4299)
- [ci] Various CI improvements, in particular putting back GPU testing on windows (#4421) (#4014) (#4053) (#4482) (#4475) (#3998) (#4388) (#4179) (#4394) (#4162) (#4065) (#3928) (#4081) (#4203) (#4011) (#4055) (#4074) (#4419) (#4067) (#4201) (#4200) (#4202) (#4496) (#3925)
- [ci] Ping maintainers in case a PR was not properly labeled (#3993) (#4012) (#4021) (#4501)
- [datasets] Add bzip2 file compression support to datasets (#4097)
- [datasets] Faster dataset indexing (#3939)
- [datasets] Enable logging of internal dataset instantiations (#4319) (#4090)
- [datasets] Removed copy=False in torch.from_numpy in MNIST to avoid warning (#4184)
- [io] Add warning for files with corrupt containers (#3961)
- [models, tests] Add test to check that classification models are FX-compatible (#3662)
- [tests] Speedup various tests (#3929) (#3933) (#3936)
- [models] Allow custom activation in SqueezeExcitation of EfficientNet (#4448)
- [models] Allow gradient backpropagation through GeneralizedRCNNTransform to inputs (#4327)
- [ops, tests] Add JIT tests (#4472)
- [ops] Make StochasticDepth FX-compatible (#4373)
- [ops] Added backward pass on CPU and CUDA for interpolation with anti-alias option (#4208) (#4211)
- [ops] Small refactoring to support opt mode for torchvision ops (fb internal specific) (#4080) (#4095)
- [reference scripts] Added Exponential Moving Average support to classification reference script (#4381) (#4406) (#4407)
- [reference scripts] Adding label smoothing on classification reference (#4335)
- [reference scripts] Further enhance Classification Reference (#4444)
- [reference scripts] Replaced to_tensor() with pil_to_tensor() + convert_image_dtype() (#4452)
- [reference scripts] Update the metrics output on reference scripts (#4408)
- [reference scripts] Warmup schedulers in References (#4411)
- [tests] Add check for fx compatibility on segmentation and video models (#4131)
- [tests] Mock redirection logic for tests (#4197)
- [tests] Replace set_deterministic with non-deprecated spelling (#4212)
- [tests] Skip building torchvision with ffmpeg when python==3.9 (#4417)
- [tests] [jit] Make operation call accept Stack& instead of Stack* (#63414) (#4380)
- [tests] Make tests that involve GDrive more robust (#4454)
- [tests] Remove dependency for dtype getters (#4291)
- [transforms] Replaced example usage of ToTensor() by PILToTensor() + ConvertImageDtype() (#4494)
- [transforms] Explicitly copying array in pil_to_tensor (#4566) (#4573)
- [transforms] Make get_image_size and get_image_num_channels public (#4321)
- [transforms] Adding grayscale image support for adjust_contrast and adjust_saturation (#4477) (#4480)
- [utils] Support single color in utils.draw_bounding_boxes (#4075)
- [video, documentation] Port the video_api.ipynb notebook to the example gallery (#4241)
- [video, io, tests] Added check for invalid input file (#3932)
- [video, io] Remove deprecated function call (#3861) (#3989)
- [video, tests] Removed test_audio_video_sync as it doesn't work as expected (#4050)
- [video] Build torchvision with ffmpeg only on Linux and ignore ffmpeg on other platforms (#4413, #4410, #4041)
Bug Fixes
- [build] Conda: Add numpy dependency (#4442)
- [build] Explicitly exclude PIL 8.3.0 from compatible dependencies (#4148)
- [build] More robust version check (#4285)
- [ci] Fix broken clang format test (#4320)
- [ci] Remove mentions of conda-forge (#4082)
- [ci] Fixup '' -> '/./' for CI filter (#4059)
- [datasets] Fix download from google drive which was downloading empty files in some cases (#4109)
- [datasets] Fix splitting CelebA dataset (#4377)
- [datasets] Add support for files with periods in name (#4099)
- [io, tests] Don't check transparency channel for pil >= 8.3 in test_decode_png (#4167)
- [io] Fix size_t issues across JPEG versions and platforms (#4439)
- [io] Raise proper error when decoding 16-bits jpegs (#4101)
- [io] Unpinned the libjpeg version and fixed jpeg_mem_dest's size type Wind… (#4288)
- [io] Deinterlacing PNG images with read_image (#4268)
- [io] More robust ffmpeg version query in setup.py (#4254)
- [io] Fixed read_image bug (#3948)
- [models] Don't download backbone weights if pretrained=True (#4283)
- [onnx, tests] Do not disable profiling executor in ...
Minor bugfix release
This release depends on PyTorch 1.9.1.
No functional changes other than minor updates to CI rules.
iOS support, GPU image decoding, SSDlite and more
This release improves support for mobile, with new mobile-friendly detection models based on SSD and SSDlite, CPU kernels for quantized NMS and quantized RoIAlign, pre-compiled binaries for iOS available in cocoapods and an iOS demo app. It also improves image IO by providing JPEG decoding on the GPU, and more.
Highlights
[BETA] New models for detection
SSD and SSDlite are two popular object detection architectures which are efficient in terms of speed and provide good results for low resolution pictures. In this release, we provide implementations for the original SSD model with VGG16 backbone and for its mobile-friendly variant SSDlite with MobileNetV3-Large backbone. The models were pre-trained on COCO train2017 and can be used as follows:
```python
import torch
import torchvision

# Original SSD variant
x = [torch.rand(3, 300, 300), torch.rand(3, 500, 400)]
m_detector = torchvision.models.detection.ssd300_vgg16(pretrained=True)
m_detector.eval()
predictions = m_detector(x)

# Mobile-friendly SSDlite variant
x = [torch.rand(3, 320, 320), torch.rand(3, 500, 400)]
m_detector = torchvision.models.detection.ssdlite320_mobilenet_v3_large(pretrained=True)
m_detector.eval()
predictions = m_detector(x)
```
The following accuracies can be obtained on COCO val2017 (full results available in #3403 and #3757):
Model | mAP | mAP@50 | mAP@75 |
---|---|---|---|
SSD300 VGG16 | 25.1 | 41.5 | 26.2 |
SSDlite320 MobileNetV3-Large | 21.3 | 34.3 | 22.1 |
[STABLE] Quantized kernels for object detection
The forward pass of the nms and roi_align operators now supports tensors with a quantized dtype, which can help lower the memory footprint of object detection models, particularly on mobile environments.
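A minimal sketch of what this enables, assuming inputs quantized with `torch.quantize_per_tensor` (the scales and zero-points below are arbitrary):

```python
import torch
from torchvision.ops import nms

boxes = torch.rand(10, 4) * 100
boxes[:, 2:] += boxes[:, :2]  # ensure x2 > x1 and y2 > y1
scores = torch.rand(10)

# quantize the inputs; the forward pass of nms now accepts quantized dtypes
qboxes = torch.quantize_per_tensor(boxes, scale=1.0, zero_point=0, dtype=torch.quint8)
qscores = torch.quantize_per_tensor(scores, scale=0.01, zero_point=0, dtype=torch.quint8)
keep = nms(qboxes, qscores, iou_threshold=0.5)  # indices of the kept boxes
```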
[BETA] JPEG decoding on the GPU
Decoding jpegs is now possible on GPUs with the use of nvjpeg, which should be readily available in your CUDA setup. The decoding time of a single image should be about 2 to 3 times faster than with libjpeg on CPU. While the resulting tensor will be stored on the GPU device, the input raw tensor still needs to reside on the host (CPU), because the first stages of the decoding process take place on the host:
```python
from torchvision.io.image import read_file, decode_jpeg

data = read_file('path_to_image.jpg')   # raw data is on CPU
img = decode_jpeg(data, device='cuda')  # decoded image is on GPU
```
[BETA] iOS support
TorchVision 0.10 now provides pre-compiled iOS binaries for its C++ operators, which means you can run Faster R-CNN and Mask R-CNN on iOS. An example app showing how to build a program leveraging those ops can be found here.
[STABLE] Speed optimizations for Tensor transforms
The resize and flip transforms have been optimized and their runtime improved by up to 5x on the CPU. The corresponding PRs were sent to PyTorch in pytorch/pytorch#51653, pytorch/pytorch#54500 and pytorch/pytorch#56713.
[STABLE] Documentation improvements
Significant improvements were made to the documentation. In particular, a new gallery of examples is available: see here for the latest version (the stable version was not released at the time of writing). These examples visually illustrate how each transform acts on an image, and also properly document and illustrate the output of the segmentation models.
The example gallery will be extended in the future to provide more comprehensive examples and serve as a reference for common torchvision tasks.
Backwards Incompatible Changes
- [transforms] Ensure input type of `normalize` is float (#3621)
- [models] Use PyTorch `smooth_l1_loss` and remove private custom implementation (#3539)
New Features
- Added iOS binaries and test app (#3582)(#3629) (#3806)
- [datasets] Added KITTI dataset (#3640) (see the sketch after this list)
- [utils] Added utility to draw segmentation masks (#3330, #3824)
- [models] Added the SSD & SSDlite object detection models (#3403, #3757, #3766, #3855, #3896, #3818, #3799)
- [transforms] Added `antialias` option to `transforms.functional.resize` (#3761, #3810, #3842)
- [transforms] Add new `max_size` parameter to `Resize` (#3494)
- [io] Support for decoding jpegs on GPU with `nvjpeg` (#3792)
- [ci, rocm] Add ROCm to builds (#3840) (#3604) (#3575)
- [ops, models.quantization] Add quantized version of NMS (#3601)
- [ops, models.quantization] Add quantized version of RoIAlign (#3624, #3904)
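As a small usage sketch for the new KITTI dataset mentioned above (the download is large; the paths and splits follow the usual `torchvision.datasets` conventions):

```python
from torchvision import datasets

# object-detection split of KITTI; downloads into "data/" if not present
ds = datasets.Kitti("data/", train=True, download=True)
img, target = ds[0]  # PIL image and a list of per-object annotation dicts
```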
Improvements
- [build] Various build improvements: (#3618) (#3622) (#3399) (#3794) (#3561)
- [ci] Various CI improvements (#3647) (#3609) (#3635) (#3599) (#3778) (#3636) (#3809) (#3625) (#3764) (#3679) (#3869) (#3871) (#3444) (#3445) (#3480) (#3768) (#3919) (#3641)(#3900)
- [datasets] Improve error handling in `make_dataset` (#3496)
- [datasets] Remove caching from MNIST and variants (#3420)
- [datasets] Make `DatasetFolder.find_classes` public (#3628)
- [datasets] Separate extraction and decompression logic in `datasets.utils.extract_archive` (#3443)
- [datasets, tests] Improve dataset test coverage and infrastructure (#3450) (#3457) (#3454) (#3447) (#3489) (#3661) (#3458) (#3705) (#3411) (#3461) (#3465) (#3543) (#3550) (#3665) (#3464) (#3595) (#3466) (#3468) (#3467) (#3486) (#3736) (#3730) (#3731) (#3477) (#3589) (#3503) (#3423) (#3492) (#3578) (#3605) (#3448) (#3864) (#3544)
- [datasets, tests] Fix lazy importing for dataset tests (#3481)
- [datasets, tests] Fix `test_extract(zip|tar|tar_xz|gzip)` on windows (#3542)
- [datasets, tests] Fix `kwargs` forwarding in fake data utility functions (#3459)
- [datasets, tests] Properly fix dataset test that passes by accident (#3434)
- [documentation] Improve the documentation infrastructure (#3868) (#3724) (#3834) (#3689) (#3700) (#3513) (#3671) (#3490) (#3660) (#3594)
- [documentation] Various documentation improvements (#3793) (#3715) (#3727) (#3838) (#3701) (#3923) (#3643) (#3537) (#3691) (#3453) (#3437) (#3732) (#3683) (#3853) (#3684) (#3576) (#3739) (#3530) (#3586) (#3744) (#3645) (#3694) (#3584) (#3615) (#3693) (#3706) (#3646) (#3780) (#3704) (#3774) (#3634)(#3591)(#3807)(#3663)
- [documentation, ci] Improve the CI infrastructure for documentation (#3734) (#3837) (#3796) (#3711)
- [io] remove deprecated function calls (#3859) (#3858)
- [documentation, io] Improve IO docs and expose `ImageReadMode` in `torchvision.io` (#3812)
- [onnx, models] Replace `reshape` with `flatten` in MobileNetV2 (#3462)
- [ops, tests] Added test for `aligned=True` (#3540)
- [ops, tests] Add onnx test for `batched_nms` (#3483)
- [tests] Various test improvements (#3548) (#3422) (#3435) (#3860) (#3479) (#3721) (#3872) (#3908) (#2916) (#3917) (#3920) (#3579)
- [transforms] Add `__repr__` for `transforms.RandomErasing` (#3491)
- [transforms, documentation] Add documentation for AutoAugment (#3529)
- [transforms, documentation] Add illustrations of transforms with sphinx-gallery (#3652)
- [datasets] Remove pandas dependency for CelebA dataset (#3656, #3698)
- [documentation] Add docs for missing datasets (#3536)
- [referencescripts] Make reference scripts compatible with `submitit` (#3785)
- [referencescripts] Updated `all_gather()` to make use of `all_gather_object()` from PyTorch (#3857)
- [datasets] Added dataset download support in fbcode (#3823) (#3826)
Code quality
- Remove inconsistent FB copyright headers (#3741)
- Keep consistency in classes `ConvBNActivation` (#3750)
- Removed unused imports (#3738, #3740, #3639)
- Fixed `floor_divide` deprecation warnings seen in pytest output (#3672)
- Unify onnx and JIT `resize` implementations (#3654)
- Cleaned-up imports in test files related to datasets (#3720)
- [documentation] Remove old css file (#3839)
- [ci] Fix inconsistent version pinning across yaml files (#3790)
- [datasets] Remove redundant `path.join` in `Places365` (#3545)
- [datasets] Remove imprecise error handling in `PhotoTour` dataset (#3488)
- [datasets, tests] Remove obsolete `test_datasets_transforms.py` (#3867)
- [models] Making protected params of MobileNetV3 public (#3828)
- [models] Make target argument in `transform.py` truly optional (#3866)
- [models] Adding some references on MobileNetV3 implementation (#3850)
- [models] Refactored `set_cell_anchors()` in `AnchorGenerator` (#3755)
- [ops] Minor cleanup of `roi_align_forward_kernel_impl` (#3619)
- [ops] Replace deprecated `AutoNonVariableTypeMode` with `AutoDispatchBelowADInplaceOrView` (#3786, #3897)
- [tests] Port tests to use pytest (#3852, #3845, #3697, #3907, #3749)
- [ops, tests] Simplify `get_script_fn` (#3541)
- [tests] Use torch.testing.assert_close in our test suite (#3886) (#3885) (#3883) (#3882) (#3881) (#3887) (#3880) (#3878) (#3877) (#3875) (#3888) (#3874) (#3884) (#3876) (#3879) (#3873)
- [tests] Clean up test accept behaviour (#3759)
- [tests] Remove unused `masks` variable in `test_image.py` (#3910)
- [transforms] Use ternary if in `resize` (#3533)
- [transforms] Replaced deprecated call to `ByteTensor` with `from_numpy` (#3813)
- [transforms] Remove unnecessary casting in `adjust_gamma` (#3472)
Bugfixes
- [ci] set empty cxx flags as default (#3474)
- [android][test_app] Cleanup duplicate dependency (#3428)
- Remove leftover exception (#3717)
- Corrected spelling in a `TypeError` (#3659)
- Add missing device info (#3651)
- Moving tensors to the right device (#3870)
- Proper error message (#3725)
- [ci, io] Pin JPEG version to resolve the size_t issue on windows (#3787)
- [datasets] Make LSUN OS agnostic (#3455)
- [datasets] Update `squeezenet` urls (#3581)
- [datasets] Add `.item()` to the `target` variable in `fakedataset.py` (#3587)
- [datasets] Fix VOC da...
Dataset bugfixes
Highlights
This minor release bumps the pinned PyTorch version to v1.8.1, and brings a few bugfixes for datasets, including MNIST download not being available.
Bugfixes
Mobile support, AutoAugment, improved IO and more
This release introduces improved support for mobile, with new mobile-friendly models, pre-compiled binaries for Android available in maven and an android demo app. It also improves image IO and provides new data augmentations including AutoAugment.
Highlights
Better mobile support
torchvision 0.9 adds support for the MobileNetV3 architecture with pre-trained weights for Classification, Object Detection and Segmentation tasks.
It also improves C++ operators so that they can be compiled and run on Android, and we are providing pre-compiled torchvision artifacts published to jcenter. An example application showing how to use the torchvision ops on an Android app can be found here.
Classification
We provide MobileNetV3 variants (including a quantized version) pre-trained on ImageNet 2012.
```python
import torch
import torchvision

# Classification
x = torch.rand(1, 3, 224, 224)
m_classifier = torchvision.models.mobilenet_v3_large(pretrained=True)
# m_classifier = torchvision.models.mobilenet_v3_small(pretrained=True)
m_classifier.eval()
predictions = m_classifier(x)

# Quantized Classification
x = torch.rand(1, 3, 224, 224)
m_classifier = torchvision.models.quantization.mobilenet_v3_large(pretrained=True)
m_classifier.eval()
predictions = m_classifier(x)
```
The pre-trained models have the following accuracies on ImageNet 2012 val:
Model | Top-1 Acc | Top-5 Acc |
---|---|---|
MobileNetV3 Large | 74.042 | 91.340 |
MobileNetV3 Large (Quantized) | 73.004 | 90.858 |
MobileNetV3 Small | 67.620 | 87.404 |
Object Detection
We provide two variants of Faster R-CNN with MobileNetV3 backbone pre-trained on COCO train2017. They can be obtained as follows:
```python
import torch
import torchvision

# Fast Low Resolution Model
x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
m_detector = torchvision.models.detection.fasterrcnn_mobilenet_v3_large_320_fpn(pretrained=True)
m_detector.eval()
predictions = m_detector(x)

# Highly Accurate High Resolution Model
x = [torch.rand(3, 300, 400), torch.rand(3, 500, 400)]
m_detector = torchvision.models.detection.fasterrcnn_mobilenet_v3_large_fpn(pretrained=True)
m_detector.eval()
predictions = m_detector(x)
```
They yield the following accuracies on COCO val2017 (full results available in #3265):
Model | mAP | mAP@50 | mAP@75 |
---|---|---|---|
Faster R-CNN MobileNetV3-Large 320 FPN | 22.8 | 38.0 | 23.2 |
Faster R-CNN MobileNetV3-Large FPN | 32.8 | 52.5 | 34.3 |
Semantic Segmentation
We also provide pre-trained models for semantic segmentation. The models have been trained on a subset of COCO train2017, which contains the same 20 categories as those from Pascal VOC.
```python
import torch
import torchvision

# Fast Mobile Model
x = torch.rand(1, 3, 520, 520)
m_segmenter = torchvision.models.segmentation.lraspp_mobilenet_v3_large(pretrained=True)
m_segmenter.eval()
predictions = m_segmenter(x)

# Highly Accurate Mobile Model
x = torch.rand(1, 3, 520, 520)
m_segmenter = torchvision.models.segmentation.deeplabv3_mobilenet_v3_large(pretrained=True)
m_segmenter.eval()
predictions = m_segmenter(x)
```
The pre-trained models give the following results on the subset of COCO val2017 that contains the same 20 categories as those present in Pascal VOC (full results in #3276):
Model | mean IoU | global pixelwise accuracy |
---|---|---|
Lite R-ASPP with Dilated MobileNetV3 Large Backbone | 57.9 | 91.2 |
DeepLabV3 with Dilated MobileNetV3 Large Backbone | 60.3 | 91.2 |
Addition of the AutoAugment method
AutoAugment is a common Data Augmentation technique that can improve the accuracy of Scene Classification models. Though the data augmentation policies are directly linked to the dataset on which they were trained, empirical studies show that ImageNet policies provide significant improvements when applied to other datasets.
In TorchVision we implemented 3 policies learned on the following datasets: ImageNet, CIFAR10 and SVHN. The new transform can be used standalone or mixed and matched with existing transforms:
```python
from torchvision import transforms

# image is a PIL Image
t = transforms.AutoAugment()
transformed = t(image)

transform = transforms.Compose([
    transforms.Resize(256),
    transforms.AutoAugment(),
    transforms.ToTensor()])
```
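The learned policy can also be selected explicitly; a short sketch (IMAGENET is the default):

```python
from torchvision import transforms
from torchvision.transforms import AutoAugmentPolicy

t = transforms.AutoAugment(policy=AutoAugmentPolicy.CIFAR10)
# t = transforms.AutoAugment(policy=AutoAugmentPolicy.SVHN)
```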
Improved Image IO and on-the-fly image type conversions
All the read and decode methods of the `io.image` package have been updated to:
- Add support for Palette, Grayscale Alpha and RGB Alpha image types during PNG decoding.
- Allow the on-the-fly conversion of image from one type to the other during read.
```python
from torchvision.io.image import read_image, ImageReadMode

# keeps original type, channels unchanged
x1 = read_image("image.png")
# converts to grayscale, channels = 1
x2 = read_image("image.png", mode=ImageReadMode.GRAY)
# converts to grayscale with alpha transparency, channels = 2
x3 = read_image("image.png", mode=ImageReadMode.GRAY_ALPHA)
# converts to RGB, channels = 3
x4 = read_image("image.png", mode=ImageReadMode.RGB)
# converts to RGB with alpha transparency, channels = 4
x5 = read_image("image.png", mode=ImageReadMode.RGB_ALPHA)
```
Python 3.9 and CUDA 11.1
This release adds official support for Python 3.9 and CUDA 11.1 (#3341, #3418)
Backwards Incompatible Changes
- [Ops] Change default `eps` value of `FrozenBN` to better align with `nn.BatchNorm` (#2933)
- [Ops] Remove deprecated `_new_empty_tensor` (#3156)
- [Transforms] `ColorJitter` gets its random params by calling `get_params()` (#3001)
- [Transforms] Change rounding of transforms on integer tensors (#2964)
- [Utils] Remove `normalize` from `save_image` (#3324)
New Features
- [Datasets] Add WiderFace dataset (#2883)
- [Models] Add MobileNetV3 architecture
- [Models] Improve speed/accuracy of FasterRCNN by introducing a score threshold on RPN (#3205)
- [Mobile] Add Android gradle project with demo test app (#2897)
- [Transforms] Implemented AutoAugment, along with required new transforms + Policies (#3123)
- [Ops] Added support of Autocast in all Operators: #2938, #2926, #2922, #2928, #2905, #2906, #2907, #2898
- [Ops] Add modulation input for DeformConv2D (#2791) (see the sketch after this list)
- [IO] Improved `io.image` with on-the-fly image type conversions (#3193, #3069, #3024, #2988, #2984)
- [IO] Add option to write audio to video file (#2304)
- [Utils] Added a utility to draw bounding boxes (#2785, #3296, #3075)
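As a sketch of the new modulation ("mask") input for DeformConv2D mentioned above, with the expected shapes spelled out (a neutral zero offset and constant mask are used purely for illustration):

```python
import torch
from torchvision.ops import DeformConv2d

x = torch.rand(1, 3, 8, 8)
conv = DeformConv2d(3, 6, kernel_size=3, padding=1)

# 2 offset coordinates per 3x3 kernel sampling location
offset = torch.zeros(1, 2 * 3 * 3, 8, 8)
# new: one modulation scalar per sampling location
mask = torch.sigmoid(torch.zeros(1, 3 * 3, 8, 8))

out = conv(x, offset, mask)  # shape (1, 6, 8, 8)
```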
Improvements
Datasets
- Concatenate small tensors in video datasets to reduce the use of shared file descriptors (#1795)
- Improve testing for datasets (#3336, #3337, #3402, #3412, #3413, #3415, #3416, #3345, #3376, #3346, #3338)
- Check if dataset file is located on Google Drive before downloading it (#3245)
- Improve Coco implementation (#3417)
- Make download_url follow redirects (#3236)
- Make `make_dataset` a `staticmethod` of `DatasetFolder` (#3215)
- Add a warning if any clip can't be obtained from a video in `VideoClips` (#2513)
Models
- Improve error message in `AnchorGenerator` (#2960)
- Disable pretrained backbone downloading if pretrained is True in segmentation models (#3325)
- Support for image with no annotations in RetinaNet (#3032)
- Change RoIHeads reshape to support empty batches. (#3031)
- Fixed typing exception throwing issues with JIT (#3029)
- Replace deprecated `functional.sigmoid` with `torch.sigmoid` in RetinaNet (#3307)
- Assert that inputs are floating point in Faster R-CNN normalize method (#3266)
- Speedup RetinaNet's postprocessing (#2828)
Ops
- Added eps in the `__repr__` of FrozenBN (#2852)
- Added `__repr__` to `MultiScaleRoIAlign` (#2840)
- Exposing LevelMapper params in `MultiScaleRoIAlign` (#3151)
- Enable autocast for all operators and let them use the dispatcher (#2926, #2922, #2928, #2898)
Transforms
- `adjust_hue` now accepts tensors with one channel (#3222)
- Add `fill` color support for tensor affine transforms (#2904)
- Remove torchscript workaround for `center_crop` (#3118)
- Improved error message for `RandomCrop` (#2816)
IO
- Enable importing `read_file` and the other methods from torchvision.io (#2918)
- Accept python bytes in `_read_video_from_memory()` (#3347)
- Enable rtmp timeout in decoder (#3076)
- Specify tls cert file to decoder through config (#3289, #3374)
- Add UUID in LOG() in decoder (#3080)
References
- Add weight averaging and storing methods in references utils (#3352)
- Adding Preset Transforms in reference scripts (#3317)
- Load variables when `--resume /path/to/checkpoint --test-only` (#3285)
- Updated video classification ref example with new transforms (#2935)
Misc
- Various documentation improvements (#3039, #3271, #2820, #2808, #3131, #3062, #3061, #3000, #3299, #3400, #2899, #2901, #2908, #2851, #2909, #3005, #2821, #2957, #3360, #3019, #3124, #3217, #2879, #3234, #3180, #3425, #2979, #2935, #3298, #3268, #3203, #3290, #3295, #3200, #2663, #3153, #3147, #3232)
- The documentation infrastructure was improved, in particular the docs are now built on every PR and uploaded to CircleCI (#3259, #3378, #3408, #3373, #3290)
- Avoid some deprecation warnings from PyTorch (#3348)
- Ensure operators are added in C++ (#2798, #3091, #3391)
- Fixed compilation warnings on C++ codebase (#3390)
- CI Improvements (#3401, #3329, #2990, #2978, #3189, #3230, #3254, #2844, #2872, #2825, #3144, #3137, #2827, #2848, #2914, #3419, #2895, #2837)
- Installation improvements (#3302, #2969, #3113, #3202)
- CMake improvemen...
Python 3.9 support and bugfixes
This minor release bumps the pinned PyTorch version to v1.7.1, and contains some minor improvements.
Highlights
Python 3.9 support
This release adds native binaries for Python 3.9 (#3063).
Bugfixes
Added version suffix back to package
Issues resolved:
- Cannot pip install torchvision==0.8.0+cu110 (#2912)
Improved transforms, native image IO, new video API and more
This release brings new additions to torchvision that improve support for model deployment. Most notably, transforms in torchvision are now torchscript-compatible, and can thus be serialized together with your model for simpler deployment. Additionally, we provide native image IO with torchscript support, and a new video reading API (released as Beta) which is more flexible than `torchvision.io.read_video`.
Highlights
Transforms now support Tensor, batch computation, GPU and TorchScript
torchvision transforms now inherit from `nn.Module` and can be torchscripted and applied on torch Tensor inputs as well as on PIL images. They also support Tensors with a batch dimension and work seamlessly on CPU/GPU devices:
```python
import torch
import torchvision.transforms as T

# to fix random seed, use torch.manual_seed
# instead of random.seed
torch.manual_seed(12)

transforms = torch.nn.Sequential(
    T.RandomCrop(224),
    T.RandomHorizontalFlip(p=0.3),
    T.ConvertImageDtype(torch.float),
    T.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
)
scripted_transforms = torch.jit.script(transforms)
# Note: we can similarly use T.Compose to define transforms
# transforms = T.Compose([...]) and
# scripted_transforms = torch.jit.script(torch.nn.Sequential(*transforms.transforms))

tensor_image = torch.randint(0, 256, size=(3, 256, 256), dtype=torch.uint8)

# works directly on Tensors
out_image1 = transforms(tensor_image)
# on the GPU
out_image1_cuda = transforms(tensor_image.cuda())

# with batches
batched_image = torch.randint(0, 256, size=(4, 3, 256, 256), dtype=torch.uint8)
out_image_batched = transforms(batched_image)

# and has torchscript support
out_image2 = scripted_transforms(tensor_image)
```
These improvements enable the following new features:
- support for GPU acceleration
- batched transformations e.g. as needed for videos
- transform multi-band torch tensor images (with more than 3-4 channels)
- torchscript transforms together with your model for deployment
Note: Exceptions for TorchScript support include `Compose`, `RandomChoice`, `RandomOrder`, `Lambda` and those applied on PIL images, such as `ToPILImage`.
Native image IO for JPEG and PNG formats
torchvision 0.8.0 introduces native image reading and writing operations for JPEG and PNG formats. Those operators support TorchScript and return `CxHxW` tensors in `uint8` format, and can thus now be part of your model for deployment in C++ environments.
```python
import torch
from torchvision.io import read_image

# tensor_image is a CxHxW uint8 Tensor
tensor_image = read_image('path_to_image.jpeg')

# or equivalently
from torchvision.io.image import read_file, decode_image
# raw_data is a 1d uint8 Tensor with the raw bytes
raw_data = read_file('path_to_image.jpeg')
tensor_image = decode_image(raw_data)

# all operators are torchscriptable and can be
# serialized together with your model torchscript code
scripted_read_image = torch.jit.script(read_image)
```
New detection model
This release adds a pretrained model for RetinaNet with a ResNet50 backbone from Focal Loss for Dense Object Detection, with the following accuracies on COCO val2017:
```
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.364
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.558
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.383
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.193
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.400
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.490
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.315
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.506
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.558
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.386
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.595
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.699
```
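A minimal usage sketch, following the same pattern as the other detection models:

```python
import torch
import torchvision

model = torchvision.models.detection.retinanet_resnet50_fpn(pretrained=True)
model.eval()
x = [torch.rand(3, 300, 400)]
predictions = model(x)  # list of dicts with "boxes", "labels" and "scores"
```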
[BETA] New Video Reader API
This release introduces a new video reading abstraction, which gives more fine-grained control on how to iterate over the videos. It supports image and audio, and implements an iterator interface so that it can be combined with the rest of the python ecosystem, such as `itertools`.
```python
from torchvision.io import VideoReader

# stream indicates if reading from audio or video
reader = VideoReader('path_to_video.mp4', stream='video')
# can change the stream after construction
# via reader.set_current_stream

# to read all frames in a video starting at 2 seconds
for frame in reader.seek(2):
    # frame is a dict with "data" and "pts" metadata
    print(frame["data"], frame["pts"])

# because reader is an iterator you can combine it with itertools
from itertools import takewhile, islice

# read 10 frames starting from 2 seconds
for frame in islice(reader.seek(2), 10):
    pass

# or to return all frames between 2 and 5 seconds
for frame in takewhile(lambda x: x["pts"] < 5, reader.seek(2)):
    pass
```
Note: In order to use the Video Reader API, you need to compile torchvision from source and make sure that you have ffmpeg installed in your system.
Note: the VideoReader API is currently released as beta and its API can change following user feedback.
Backwards Incompatible Changes
- [Transforms] The random seed should now be set with `torch.manual_seed` instead of `random.seed` (#2292)
- [Transforms] `RandomErasing.get_params` function's argument was previously `value=0` and is now `value=None`, which is interpreted as Gaussian random noise (#2386)
- [Transforms] `RandomPerspective` and `F.perspective` changed the default value of interpolation to be `BILINEAR` instead of `BICUBIC` (#2558, #2561)
- [Transforms] Fixes incoherence in `affine` transformation when center is defined as half image size + 0.5 (#2468)
New Features
- [Ops] Added focal loss (#2784)
- [Ops] Added bounding boxes conversion function (#2710, #2737)
- [Ops] Added Generalized IoU (#2642) (see the sketch after this list)
- [Models] Added RetinaNet object detection model (#2784)
- [Datasets] Added Places365 dataset (#2610, #2625)
- [Transforms] Added GaussianBlur transform (#2658)
- [Transforms] Added torchscript, batch and GPU and tensor support for transforms (#2769, #2767, #2749, #2755, #2485, #2721, #2645, #2694, #2584, #2661, #2566, #2345, #2342, #2356, #2368, #2373, #2496, #2553, #2495, #2561, #2518, #2478, #2459, #2444, #2396, #2401, #2394, #2586, #2371, #2477, #2456, #2628, #2569, #2639, #2620, #2595, #2456, #2403, #2729)
- [Transforms] Added example notebook for tensor transforms (#2730)
- [IO] Added JPEG/PNG encoding / decoding ops
- [IO] Added file reading / writing ops (#2728, #2765, #2768)
- [IO] [BETA] Added new VideoReader API (#2683, #2781, #2778, #2802, #2596, #2612, #2734, #2770)
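A small sketch of the new box utilities mentioned above, covering conversion between box formats and the pairwise generalized IoU matrix:

```python
import torch
from torchvision.ops import box_convert, generalized_box_iou

# convert (cx, cy, w, h) boxes to (x1, y1, x2, y2)
boxes = torch.tensor([[50., 50., 20., 20.], [60., 60., 30., 30.]])
xyxy = box_convert(boxes, in_fmt="cxcywh", out_fmt="xyxy")

giou = generalized_box_iou(xyxy, xyxy)  # pairwise matrix of shape (2, 2)
```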
Improvements
Datasets
- Added error message if Google Drive download quota is exceeded (#2321)
- Optimized LSUN initialization time by only pulling keys from db (#2544)
- Use more precise return type for gzip.open() (#2792)
- Added UCF101 dataset tests (#2548)
- Added download tests on a schedule (#2665, #2675, #2699, #2706, #2747, #2731)
- Added typehints for datasets (#2487, #2521, #2522, #2523, #2524, #2526, #2528, #2529, #2525, #2527, #2530, #2533, #2534, #2535, #2536, #2532, #2538, #2537, #2539, #2531, #2540, #2667)
Models
- Removed hard coded value in DeepLabV3 (#2793)
- Changed the anchor generator default argument to an equivalent one (#2722)
- Moved model construction in `resnet_fpn_backbone` to after the docstring (#2482)
- Partially enabled type hints for models (#2668)
Ops
- Moved RoIs shape check to C++ (#2794)
- Use autocast built-in cast-helper functions (#2646)
- Added type annotations for `torchvision.ops` (#2331, #2462)
References
- [References] Removed redundant target send to device in detection evaluation (#2503)
- [References] Removed obsolete import in segmentation. (#2399)
Misc
- [Transforms] Added support for negative padding in `pad` (#2744)
- [IO] Added type hints for `torchvision.io` (#2543)
- [ONNX] Export `ROIAlign` with `aligned=True` (#2613)
Internal
- [Binaries] Added CUDA 11 binary builds (#2671)
- [Binaries] Added DEBUG=1 option to build torchvision (#2603)
- [Binaries] Unpin ninja version (#2358)
- Warn if torchvision imported from repo root (#2759)
- Added compatibility checks for C++ extensions (#2467)
- Added probot (#2448)
- Added ipynb to git attributes file (#2772)
- CI improvements (#2328, #2346, #2374, #2437, #2465, #2579, #2577, #2633, #2640, #2727, #2754, #2674, #2678)
- CMakeList improvements (#2739, #2684, #2626, #2585, #2587)
- Documentation improvements (#2659, #2615, #2614, #2542, #2685, #2507, #2760, #2550, #2656, #2723, #2601, #2654, #2757, #2592, #2606)
Bug Fixes
- [Ops] Fixed crash in deformable convolutions (#2604)
- [Ops] Added empty batch support for `DeformConv2d` (#2782)
- [Transforms] Enforced contiguous output in `to_tensor` (#2483)
- [Transforms] Fixed fill parameter for PIL pad (#2515)
- [Models] Fixed deprecation warning in `nonzero` for R-CNN models (#2705)
- [IO] Explicitly cast to `size_t` in video decoder (#2389)
- [ONNX] Fixed dynamic resize in Mask R-CNN (#2488)
- [C++ API] Fixed function signatures for `torch::nn::Functional` (#2463)
Deprecations
- [Transforms] Deprecated dedicated implementations `functional_tensor` of `F_t.center_crop`, `F_t.five_crop`, `F_t.te...
Mixed precision training, new models and improvements
Highlights
Mixed precision support for all models
torchvision models now support mixed-precision training via the new `torch.cuda.amp` package. Using mixed precision support is easy: just wrap the model and the loss inside a `torch.cuda.amp.autocast` context manager. Here is an example with Faster R-CNN:
```python
import torch, torchvision

device = torch.device('cuda')
model = torchvision.models.detection.fasterrcnn_resnet50_fpn()
model.to(device)

input = [torch.rand(3, 300, 400, device=device)]
boxes = torch.rand((5, 4), dtype=torch.float32, device=device)
boxes[:, 2:] += boxes[:, :2]
target = [{"boxes": boxes,
           "labels": torch.zeros(5, dtype=torch.int64, device=device),
           "image_id": 4,
           "area": torch.zeros(5, dtype=torch.float32, device=device),
           "iscrowd": torch.zeros((5,), dtype=torch.int64, device=device)}]

# use automatic mixed precision
with torch.cuda.amp.autocast():
    loss_dict = model(input, target)
losses = sum(loss for loss in loss_dict.values())
# perform backward outside of autocast context manager
losses.backward()
```
New pre-trained segmentation models
This release adds pre-trained weights for the ResNet50 variants of Fully-Convolutional Networks (FCN) and DeepLabV3. They are available under `torchvision.models.segmentation` and can be obtained as follows:
```python
torchvision.models.segmentation.fcn_resnet50(pretrained=True)
torchvision.models.segmentation.deeplabv3_resnet50(pretrained=True)
```
They obtain the following accuracies:
Network | mean IoU | global pixelwise acc |
---|---|---|
FCN ResNet50 | 60.5 | 91.4 |
DeepLabV3 ResNet50 | 66.4 | 92.4 |
Improved ONNX support for Faster / Mask / Keypoint R-CNN
This release restores ONNX support for the R-CNN family of models that had been temporarily dropped in the 0.6.0 release, and additionally fixes a number of corner cases in the ONNX export for these models.
Notable improvements include support for dynamic input shape exports, including images with no detections.
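As a hedged sketch of what the restored export path looks like (opset 11 is required for the R-CNN models; the file name and input sizes below are placeholders):

```python
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(pretrained=True)
model.eval()

x = [torch.rand(3, 300, 400)]
# R-CNN models need opset 11; dynamic input shapes are supported
torch.onnx.export(model, (x,), "mask_rcnn.onnx", opset_version=11,
                  do_constant_folding=True)
```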
Backwards Incompatible Changes
- [Transforms] Fix for integer fill value in constant padding (#2284)
- [Models] Replace L1 loss with smooth L1 loss in Faster R-CNN for better performance (#2113)
- [Transforms] Use `torch.rand` instead of `random.random()` for random transforms (#2520)
New Features
- [Models] Add mixed-precision support (#2366, #2384)
- [Models] Add `fcn_resnet50` and `deeplabv3_resnet50` pretrained models (#2086, #2091)
- [Ops] Added eps attribute to FrozenBatchNorm2d (#2190)
- [Transforms] Add `convert_image_dtype` to functionals (#2078)
- [Transforms] Add `pil_to_tensor` to functionals (#2092)
Bug Fixes
- [JIT] Fix virtualenv and torchhub support by removing eager scripting calls (#2248)
- [IO] Fix `write_video` when floating point FPS is passed (#2334)
- [IO] Fix missing compilation files for video-reader (#2183)
- [IO] Fix missing include for OSX in video decoder (#2224)
- [IO] Fix overflow error for large buffers (#2303)
- [Ops] Fix wrong clamping in RoIAlign with `aligned=True` (#2438)
- [Ops] Fix corner case in `interpolate` (#2146)
- [Ops] Fix the use of `contiguous()` in C++ kernels (#2131)
- [Ops] Restore support of tuple of Tensors for region pooling ops (#2199)
- [Datasets] Fix bug related with trailing slash on UCF-101 dataset (#2186)
- [Models] Make copy of targets in GeneralizedRCNNTransform (#2227)
- [Models] Fix DenseNet issue with gradient checkpoints (#2236)
- [ONNX] Fix ONNX implementation of `heatmaps_to_keypoints` in KeypointRCNN (#2312)
- [ONNX] Fix export of images with no detection for Faster / Mask / Keypoint R-CNN (#2126, #2215, #2272)
Deprecations
- [Ops] Deprecate Conv2d, ConvTranspose2d and BatchNorm2d (#2244)
- [Ops] Deprecate `interpolate` in favor of PyTorch's implementation (#2252)
Improvements
Datasets
- Fix DatasetFolder error message (#2143)
- Change `range(len)` to `enumerate` in `DatasetFolder` (#2153)
- [DOC] Fix link URL to Flickr8k (#2178)
- [DOC] Add CelebA to docs (#2107)
- [DOC] Improve documentation of `DatasetFolder` and `ImageFolder` (#2112)
TorchHub
- Fix torchhub tests due to numerical changes in torch.sum (#2361)
- Add all the latest models to hubconf (#2189)
Transforms
- Add `fill` argument to `__repr__` of `RandomRotation` (#2340)
- Add tensor support for `adjust_hue` (#2300, #2355)
- Make `ColorJitter` torchscriptable (#2298)
- Make `RandomHorizontalFlip` and `RandomVerticalFlip` torchscriptable (#2282)
- [DOC] Use consistent symbols in the doc of `Normalize` to avoid confusion (#2181)
- [DOC] Fix typo in `hflip` in `functional.py` (#2177)
- [DOC] Fix spelling errors in `functional.py` (#2333)
IO
- Refactor `video.py` to improve clarity (#2335)
- Save memory by not storing full frames in `read_video_timestamps` (#2202, #2268)
- Improve warning when `video_reader` backend is not available (#2225)
- Set `should_buffer` to True by default in `_read_from_stream` (#2201)
- [Test] Temporarily disable one PyAV test (#2150)
Models
- Improve target checks in GeneralizedRCNN (#2207, #2258)
- Use Module objects instead of functions for some layers of Inception3 (#2287)
- Add support for other normalizations in MobileNetV2 (#2267)
- Expose layer freezing option to detection models (#2160, #2242)
- Make ASPP-Layer in DeepLab more generic (#2174)
- Faster initialization for Inception family of models (#2170, #2211)
- Make `norm_layer` a parameter in `models/detection/backbone_utils.py` (#2081)
- Updates integer division to use floor division operator (#2234, #2243)
- [JIT] Clean up no longer needed workarounds for torchscript support (#2249, #2261, #2210)
- [DOC] Add docs to clarify aspect ratio definition in RPN. (#2185)
- [DOC] Fix roi_heads argument name in docstring of GeneralizedRCNN (#2093)
- [DOC] Fix type annotation in RPN docstring (#2149)
- [DOC] add clarifications to Object detection reference documentation (#2241)
- [Test] Add tests for negative samples for Mask R-CNN and Keypoint R-CNN (#2069)
Reference scripts
- Add support for SyncBatchNorm in QAT reference script (#2230, #2280)
- Fix training resuming in `references/segmentation` (#2142)
- Rename `image` to `images` in `references/detection/engine.py` (#2187)
ONNX
- Add support for dynamic input shape export in R-CNN models (#2087)
Ops
- Added number of features in FrozenBatchNorm2d `__repr__` (#2168)
- Improve consistency among box IoU CPU / GPU calculations (#2072)
- Avoid `using` in header files (#2257)
- Make `ceil_div` `__host__ __device__` (#2217)
- Don't include CUDAApplyUtils.cuh (#2127)
- Add namespace to avoid conflict with ATen version of `channel_shuffle()` (#2206)
- [DOC] Update the statement of supporting torchscript ops (#2343)
- [DOC] Update torchvision ops in doc (#2341)
- [DOC] Improve documentation for NMS (#2159)
- [Test] Add more tests to NMS (#2279)
Misc
- Add PyTorch version compatibility table to README (#2260)
- Fix lint (#2182, #2226, #2070)
- Update version to 0.6.0 in CMake (#2140)
- Remove mock (#2096)
- Remove warning about deprecated (#2064)
- Cleanup unused import (#2067)
- Type annotations for torchvision/utils.py (#2034)
CI
- Add version suffix to build version
- Add backslash to escape
- Add workflows to run on tag
- Bump version to 0.7.0, pin PyTorch to 1.6.0
- Update link for cudnn 10.2 (#2277)
- Fix binary builds with CUDA 9.2 on Windows (#2273)
- Remove Python 3.5 from CI (#2158)
- Improvements to CI infra (#2075, #2071, #2058, #2073, #2099, #2137, #2204, #2264, #2274, #2319)
- Master version bump 0.6 -> 0.7 (#2102)
- Add test channels for pytorch version functions (#2208)
- Add static type check with mypy (#2195, #1696, #2247)