
Releases: Xilinx/brevitas

Release version 0.8.0

10 Jan 09:09

What's Changed

  • Add support for PyTorch 1.11-1.13.1. Brevitas 0.8 supports PyTorch 1.5.1 to 1.13.1, with 1.10 or later recommended.
  • Deprecate support for Python 3.6; Python 3.7 or later is now required.
  • Add support for export to ONNX QCDQ for <= int8 quantization, enabling out-of-the-box execution with onnxruntime or similar backends (see the sketch after this list).
  • Extend support for export to ONNX QOps to <= int8 quantization, also for out-of-the-box execution with onnxruntime or similar backends.
  • Add experimental support for export to torch QCDQ for <= int32 quantization, as an entry point for future MLIR integration with torch-mlir.
  • Add support for QuantRNN and QuantLSTM, with support for CIFG, bidirectional layers, shared input-hidden gates, shared quantizers, training-time JIT compilation, and partial export support to ONNX (QONNX and QCDQ).
  • Improve zero-point support for both weight and activation quantization.
  • New default asymmetric activation quantizer based on percentile rather than min/max.
  • Add more built-in quantizers (symmetric per-channel, asymmetric per-channel, symmetric decoupled per-channel).
  • Simplify interface for activation calibration.
  • Simplify interface for bias correction.
  • Initial support for QuantEmbedding.
  • Deprecate support for XIR and PyXIR export flows.
  • Many bug fixes and minor improvements.
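As a quick orientation for the new calibration and export interfaces, here is a minimal sketch. The entry points assumed below (brevitas.graph.calibrate.calibration_mode, bias_correction_mode, and brevitas.export.export_onnx_qcdq with the keyword arguments shown) may differ slightly across versions, so treat this as an illustration and check the documentation of your installed release for the exact signatures.

```python
import torch
import brevitas.nn as qnn
from brevitas.export import export_onnx_qcdq
from brevitas.graph.calibrate import calibration_mode, bias_correction_mode

# A small quantized model with <= int8 weights and activations
model = torch.nn.Sequential(
    qnn.QuantIdentity(),                   # quantize the input
    qnn.QuantConv2d(3, 8, kernel_size=3),  # int8 weights by default
    qnn.QuantReLU())                       # int8 activations
model.eval()

calib_data = [torch.randn(1, 3, 32, 32) for _ in range(8)]  # stand-in calibration set

with torch.no_grad():
    # Collect activation statistics for the activation quantizers
    with calibration_mode(model):
        for x in calib_data:
            model(x)
    # Then apply bias correction with the same data
    with bias_correction_mode(model):
        for x in calib_data:
            model(x)

# Export to ONNX QCDQ for out-of-the-box execution with onnxruntime or similar backends
export_onnx_qcdq(model, args=torch.randn(1, 3, 32, 32), export_path='quant_model.onnx')
```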

Full Changelog: v0.7.1...v0.8.0

Release version 0.7.1

14 Dec 11:10

Fixes

  • Various issues in the arithmetic of QuantTensor
  • Remove a requirement on find_unused_parameters=True in DDP
  • Bias quantization not being enabled if a bias is added to a layer after initialization
  • Sharing per-tensor weight quantizer
  • Improve implementation of zero-point from stats
  • Bias export in QOp ONNX

Full Changelog: v0.7.0...v0.7.1

Release version 0.7.0

29 Oct 11:02

Breaking changes

  • The DPUv1-specific export flow has been deprecated (since DPUv1 itself has been deprecated in Vitis AI).
  • Support for PyTorch < 1.5 has been deprecated.
  • The previous implementation of graph quantization has been deprecated.

Fixes

  • Issues between statistics collection in quantized activations and BREVITAS_JIT=1 have been resolved.
  • Statistics collection in quantized activations is now done with a Buffer before switching to a learned Parameter, to keep things consistent in distributed training.
  • Custom ONNX functions are now properly registered with PyTorch.
  • Various other minor fixes, see full changelog below.

Features

  • Support for several more operators in QuantTensor (see the sketch after this list).
  • Initial support for post-training quantization through statistics collection, bias correction, and equalization.
  • Initial support for FX-based graph quantization, currently targeting FlexML (an internal toolchain) only.
  • Various other minor enhancements, see full changelog below.
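As an illustration of the growing QuantTensor operator coverage, here is a minimal sketch; return_quant_tensor and the attributes shown are assumed from the usual QuantTensor interface, and both operands come from the same quantizer in eval mode so their scales match.

```python
import torch
import brevitas.nn as qnn

# QuantIdentity with return_quant_tensor=True produces a QuantTensor that carries
# scale, zero-point and bit width alongside the quantized values
quant_id = qnn.QuantIdentity(return_quant_tensor=True)
quant_id.eval()

a = quant_id(torch.randn(2, 4))
b = quant_id(torch.randn(2, 4))

c = a + b  # element-wise add between QuantTensors sharing the same quantizer
print(c.scale, c.zero_point, c.bit_width)
```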

Full Changelog: v0.6.0...v0.7.0

Release version 0.6.0

04 Jun 11:48

Breaking changes

  • Quantizers now require a matching proxy class to be specified through the proxy_class attribute. This is necessary to export more custom quantization techniques through BrevitasONNX. Quantization solvers like WeightQuantSolver already specify their corresponding proxy_class. Any custom quantizer that doesn't inherit from built-in solvers or quantizers will break, as sketched below.
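A minimal sketch of what this means in practice: a custom quantizer that extends a built-in quantizer (here Int8WeightPerTensorFloat, which goes through WeightQuantSolver) inherits the required proxy_class and keeps working; only quantizers assembled from scratch need to declare proxy_class explicitly. The class and layer below are placeholders.

```python
import brevitas.nn as qnn
from brevitas.quant import Int8WeightPerTensorFloat

# Extends a built-in quantizer, so proxy_class is inherited and nothing breaks
class MyInt4WeightPerTensorFloat(Int8WeightPerTensorFloat):
    bit_width = 4

layer = qnn.QuantConv2d(3, 8, kernel_size=3, weight_quant=MyInt4WeightPerTensorFloat)
```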

Features

  • New brevitas.fx subpackage with:
    • A backport of torch.fx from version 1.8.1 to earlier versions of PyTorch down to 1.3.1.
    • A generalized tracer (brevitas.fx.value_tracer) that is capable of partially evaluating against the concrete_args without reducing down to constants, as illustrated in pytorch/pytorch#56862. This allows tracing through conditionals and the unpacking of tuples, as long as representative input data is provided.
    • A symbolic tracer that accounts for Brevitas layers as leaf modules (brevitas.fx.brevitas_symbolic_trace) and its generalized variant (brevitas.fx.brevitas_value_trace); see the sketch after this list.
  • Port existing graph quantization transformations in brevitas.graph to brevitas.fx. Still not ready for easy public consumption, but useful to anyone who knows what they are doing.
  • Rewrite bias export in the FINN ONNX export flow.
  • Add DPURound, with matching STE implementations and wrappers.
  • Add matching implementations for the symbolic ops Quant and DecoupledQuant in the BrevitasONNX export flow.
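A short sketch of the symbolic tracer entry point; it is assumed here that brevitas.fx.brevitas_symbolic_trace mirrors the call signature of torch.fx.symbolic_trace, which this changelog does not confirm.

```python
import torch
import brevitas.nn as qnn
from brevitas.fx import brevitas_symbolic_trace

model = torch.nn.Sequential(
    qnn.QuantConv2d(3, 8, kernel_size=3),
    qnn.QuantReLU())

# Brevitas layers are treated as leaf modules rather than being traced through
graph_module = brevitas_symbolic_trace(model)
print(graph_module.graph)
```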

Bugfixes

  • Fix leftover issues with 16b datatypes not being preserved after quantization during mixed-precision training.
  • Fix per-channel quantization on QuantConvTranspose1d/2d.
  • Fix per-channel quantization whenever two layers with quantized weights share the same quantizer.
  • Fix export from non-CPU devices.

Release version 0.5.1

24 May 12:59

Highlights

Minor release with a bunch of fixes:

  • Fix compatibility with latest onnx (1.9+) by adding a dependency on onnxoptimizer.
  • Fix issues with calls to view on non-contiguous data in recent PyTorch versions by switching to reshape.
  • Fix a bunch of typos in the README.
  • Fix a casting issue that was preventing mixed-precision training from working (it's still generally not recommended).

Thanks to all the contributors.

Release version 0.5.0

06 May 09:41

Highlights

  • Fix an issue with LOG_FP or POWER_OF_TWO restrictions on the scale factors, where the absolute value of the scale was incorrectly being computed before exponentiation. This is a breaking change that restores the (correct) behaviour implemented in earlier releases of Brevitas, and it restores the full accuracy of some pretrained models like QuartzNet. Training with those settings should now be more stable too.
  • Support solving enum directives in quantizers that default to None. This restores support for inline enum-driven bias quantization, which was previously failing silently.
  • Add support for exporting a whole Brevitas model to ONNX (down to all its low-level operations) through a call to torch.onnx.export; see the sketch after this list.
  • Initial support for a 'generic' proxy-level ONNX export flow, found under brevitas.export.onnx.generic.
  • Initial support for exporting to Xilinx's XIR format, found under brevitas.export.onnx.vitis_ai.xir.
  • Experimental novel quantized L2,inf weight norm quantizers with learned scale (Int4WeightPerTensorFloatDecoupled and Int4WeightPerTensorFixedPointDecoupled), for high accuracy with per-tensor scaling at low precision, especially on depthwise separable layers.
  • Add the binary and ternary quantizers used in the bnn_pynq examples as standalone quantizers under brevitas.quant.
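Since this flow goes through the standard PyTorch exporter, a usage sketch is just the usual call; the model below is a placeholder.

```python
import torch
import brevitas.nn as qnn

model = torch.nn.Sequential(
    qnn.QuantConv2d(3, 8, kernel_size=3),
    qnn.QuantReLU())
model.eval()

# Exports the whole model, including its low-level quantization ops, to ONNX
torch.onnx.export(model, torch.randn(1, 3, 32, 32), 'brevitas_model.onnx')
```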

Release version 0.4.0

15 Mar 19:40

Changelog:

  • Add support for __torch_function__ to QuantTensor for supported versions of PyTorch. This finally removes the main barrier to making QuantTensor usage more ingrained in the overall library (see the sketch after this list).
  • Correctly export operators that are invoked through __torch_function__ and are invariant to quantization, such as torch.nn.functional.max_pool2d, in both standard ONNX and PyXIR.
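A small sketch of what the __torch_function__ support enables; return_quant_tensor=True on the producing layer is assumed.

```python
import torch
import torch.nn.functional as F
import brevitas.nn as qnn

quant_inp = qnn.QuantIdentity(return_quant_tensor=True)
qt = quant_inp(torch.randn(1, 1, 8, 8))

# max_pool2d is invariant to quantization, so dispatching it through
# __torch_function__ lets the result keep its QuantTensor metadata
out = F.max_pool2d(qt, kernel_size=2)
```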

Release version 0.3.1

04 Mar 18:45

Changelog:

  • Important bugfix affecting collection of activation statistics when retraining with BREVITAS_IGNORE_MISSING_KEYS=1. Statistics were not being collected; instead, the default baseline value of 1.0 was being used to initialize the scale factor. The problem doesn't affect load_state_dict(strict=False), which is an alternative to the flag above (see the sketch after this list).
  • Refactor proxies and mixins and simplify the assumptions under which an injector proxy can be created (i.e. always within a quantized layer).
  • Release tutorial on quantizers.
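For reference, a minimal sketch of the loading style mentioned above as an alternative to launching with BREVITAS_IGNORE_MISSING_KEYS=1; the checkpoint path is a placeholder.

```python
import torch
import brevitas.nn as qnn

# Load a float checkpoint into a quantized layer while tolerating
# the quantization-specific keys missing from the state dict
model = qnn.QuantLinear(16, 8, bias=True)
state_dict = torch.load('float_checkpoint.pth')  # placeholder path
model.load_state_dict(state_dict, strict=False)
```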

Release version 0.3.0

01 Mar 20:57


Changelog:

  • Enum and shape solvers are now implemented through extended dependency injectors. This finally makes declarative quantizers self-contained.
  • Reorganize CI.
  • Various smaller features and fixes.

Release version 0.2.1

05 Feb 16:15


Changelog:

  • Fix a few issues when using QuantTensors w/ zero point.
  • Fix the Hadamard layer, whose implementation had fallen behind the QuantLayer and QuantTensor semantics.
  • Make sure that the training flag in a QuantTensor is always set by the Module generating it.