
Release version 0.5.0

@volcacius released this 06 May 09:41

Highlights

  • Fix an issue with the LOG_FP and POWER_OF_TWO restrictions on scale factors, where the absolute value of the scale was incorrectly computed before exponentiation (see the first sketch after this list). This is a breaking change that restores the correct behaviour implemented in earlier releases of Brevitas, and with it the full accuracy of some pretrained models such as QuartzNet. Training with those settings should also be more stable.
  • Support resolving enum directives in quantizers that default to None. This restores support for inline enum-driven bias quantization, which was previously failing silently.
  • Add support for exporting a whole Brevitas model to ONNX, down to all its low-level operations, through a direct call to torch.onnx.export (see the export sketch after this list).
  • Initial support for a 'generic' proxy-level ONNX export flow, found under brevitas.export.onnx.generic.
  • Initial support for exporting to Xilinx's XIR format, found under brevitas.export.onnx.vitis_ai.xir.
  • Experimental decoupled weight quantizers based on an L2,inf weight norm with a learned scale (Int4WeightPerTensorFloatDecoupled and Int4WeightPerTensorFixedPointDecoupled), aimed at high accuracy with per-tensor scaling at low precision, especially on depthwise separable layers (a usage sketch follows this list).
  • Add the binary and ternary quantizers used in the bnn_pynq examples as standalone quantizers under brevitas.quant.
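Below is a minimal numerical sketch of the scale-restriction fix from the first bullet. It assumes the restriction maps a learned log2 parameter to the scale via exponentiation; the variable names are illustrative, not Brevitas API.

```python
import torch

# Hypothetical learned log2-scale parameter; negative values encode the
# fractional scales typical of quantized weights.
log2_scale = torch.tensor(-3.0)

# Regressed order of operations: taking the absolute value before
# exponentiation flips negative exponents and inflates the scale.
buggy_scale = 2.0 ** torch.abs(log2_scale)  # 8.0

# Restored behaviour: exponentiate first; the result is already positive.
fixed_scale = 2.0 ** log2_scale  # 0.125
```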
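A minimal export sketch for the torch.onnx.export flow mentioned above; the toy model and the 4-bit weight width are arbitrary choices for illustration.

```python
import torch
import brevitas.nn as qnn

# A toy quantized model, put in eval mode before tracing.
model = torch.nn.Sequential(
    qnn.QuantConv2d(3, 8, kernel_size=3, weight_bit_width=4),
    torch.nn.ReLU())
model.eval()

# Per the release note, the whole model, including Brevitas' low-level
# quantization ops, can be traced with a direct call to torch.onnx.export.
torch.onnx.export(model, torch.randn(1, 3, 32, 32), 'quant_model.onnx')
```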
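And a usage sketch for one of the new decoupled quantizers on a depthwise layer, where the release note expects the largest benefit; the import path from brevitas.quant is an assumption based on the last bullet.

```python
import torch
import brevitas.nn as qnn
from brevitas.quant import Int4WeightPerTensorFloatDecoupled  # path assumed

# Depthwise convolution (groups == channels) with 4-bit per-tensor weights.
conv = qnn.QuantConv2d(
    in_channels=32, out_channels=32, kernel_size=3, groups=32,
    weight_quant=Int4WeightPerTensorFloatDecoupled)

out = conv(torch.randn(1, 32, 16, 16))
```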