I'm trying to implement inference on hardware using the Xilinx ap_fixed type for a model quantized with alpha='auto'. With alpha=1 it is straightforward: the weights (after applying the quantizer) can be exported directly to the hardware. With alpha='auto' it is more challenging. I have not found an explanation of how to compute the weights and the scale, so I have analyzed the code.
This is an extract of quantized_bits for alpha='auto':
My understanding is that z contains the integer representation of the weights, which uses the entire range of the type; that is, the scale is optimal. xq is the floating-point representation of z, and xq2 holds the quantized weights, in floating-point representation, that are actually used in the convolution during training. These can exceed the range of the type.
To implement this in hardware I have to save z as the weights and compute scale, which is a constant that has to be applied after the convolution. For alpha='po2' it would be the same, but the scale can be applied as a bit shift.
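To make the idea concrete, here is a minimal NumPy sketch of what I mean. It is not the QKeras implementation. It assumes that the training-time weights can be written as z * scale with one scale per output channel, and it uses a plain matrix product as a stand-in for the convolution. All array shapes and values, and the helper name fold_scale_after_matmul, are made up for illustration.

```python
import numpy as np

def fold_scale_after_matmul(x, z, scale):
    """Accumulate with integer weights z, then apply the per-channel scale."""
    acc = x @ z          # integer accumulation, as the hardware would do it
    return acc * scale   # constant per output channel, applied afterwards

rng = np.random.default_rng(0)
bits = 8

# Hypothetical integer activations and integer weights (illustrative only).
x = rng.integers(-128, 128, size=(4, 16)).astype(np.int64)
z = rng.integers(-(2 ** (bits - 1)), 2 ** (bits - 1), size=(16, 3)).astype(np.int64)

# Hypothetical per-output-channel scale, standing in for alpha='auto'.
scale = np.array([0.013, 0.027, 0.008])

# Assumed training-time view: the weights are z * scale.
reference = x @ (z * scale)
hardware = fold_scale_after_matmul(x, z, scale)
assert np.allclose(reference, hardware)

# Assumed alpha='po2' view: scale = 2**shift with integer (here negative) shift,
# so the multiplication becomes an arithmetic right shift of the accumulator.
shift = np.array([-6, -5, -7])
po2_scale = 2.0 ** shift
acc = x @ z
shifted = acc >> (-shift)  # drops the fractional bits (rounds toward -inf)
assert np.array_equal(shifted, np.floor(acc * po2_scale).astype(np.int64))
```

Note that the right shift discards the fractional bits, so on real hardware the accumulator would probably have to keep enough fractional bits (or the shift would have to be folded into the fixed-point format) to avoid losing precision.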
If this is true, it would be nice to have a function that returns z and scale, as quantized_bits does not.
Thanks