Hi everyone,

Is it expected behavior that quantization-aware training in QKeras is much slower than normal training in Keras? If so, out of interest, where does the overhead come from? From the quantization-dequantization operations?
Thank you for your help!
The quantization operations add a little extra computation to every forward pass, but that should not be significant. Most of the slowdown in training is expected to come from the increased number of epochs you may need for quantized models, especially when going to low precisions, where training can become unstable.
What sorts of slow-downs are you experiencing? Do you have any examples / data?
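For reference, a minimal side-by-side timing of a plain Dense model against its QKeras counterpart can isolate the per-step cost of the quantizers themselves. This is only a sketch, assuming the standard QKeras layers (`QDense`, `QActivation`) and quantizers (`quantized_bits`, `quantized_relu`); the data shapes, layer sizes, and 4-bit setting are made up for illustration.

```python
# Sketch: compare seconds/epoch of a float Keras model vs. a QKeras model
# with the same topology, where weights/activations are fake-quantized.
import time
import numpy as np
import tensorflow as tf
from qkeras import QDense, QActivation, quantized_bits, quantized_relu

x = np.random.rand(10000, 32).astype("float32")
y = np.random.rand(10000, 1).astype("float32")

def seconds_per_epoch(model, epochs=3):
    model.compile(optimizer="adam", loss="mse")
    start = time.time()
    model.fit(x, y, batch_size=256, epochs=epochs, verbose=0)
    return (time.time() - start) / epochs

float_model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])

# Same topology, but weights and activations are fake-quantized to 4 bits
# during the forward pass (quantization-aware training).
quant_model = tf.keras.Sequential([
    tf.keras.Input(shape=(32,)),
    QDense(64, kernel_quantizer=quantized_bits(4, 0, alpha=1),
           bias_quantizer=quantized_bits(4, 0, alpha=1)),
    QActivation(quantized_relu(4)),
    QDense(1, kernel_quantizer=quantized_bits(4, 0, alpha=1),
           bias_quantizer=quantized_bits(4, 0, alpha=1)),
])

print("float  s/epoch:", seconds_per_epoch(float_model))
print("qkeras s/epoch:", seconds_per_epoch(quant_model))
```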
Thank you for your response. I have prepared a Colab notebook with a setup similar to the one I intend to work on (mapping a time sequence X to Y). With the Keras implementation, training takes 0.22 s per step; with QKeras, it takes 6 s per step. If I remove the GRU there is still a difference, but it is not as big as before (50 ms vs. 90 ms per step). I assume the main reason for the slower training of the DNN with a GRU is a non-cuDNN-optimized GRU implementation (e.g., because of the quantized activations). The TensorFlow documentation for the GRU layer also mentions that the fast cuDNN kernel is only used for the standard configuration.
It seems that slow training is not a problem of QKeras but of non-standard GRUs in general.
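To check that the cuDNN fallback, rather than the quantization math itself, dominates the cost, one can time the same GRU with and without a cuDNN-eligible configuration. A rough sketch (assumptions: a GPU is available, TF 2.x; the shapes and the choice of `recurrent_activation="hard_sigmoid"` as the non-default option are only for illustration):

```python
# Sketch: the fused cuDNN GRU kernel is only used with the default
# configuration; any non-default activation (as a quantized GRU would need)
# falls back to the slower generic implementation.
import time
import numpy as np
import tensorflow as tf

x = np.random.rand(2048, 100, 16).astype("float32")
y = np.random.rand(2048, 1).astype("float32")

def gru_model(**gru_kwargs):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(100, 16)),
        tf.keras.layers.GRU(64, **gru_kwargs),
        tf.keras.layers.Dense(1),
    ])

def seconds_per_epoch(model, epochs=3):
    model.compile(optimizer="adam", loss="mse")
    start = time.time()
    model.fit(x, y, batch_size=256, epochs=epochs, verbose=0)
    return (time.time() - start) / epochs

# Default config -> eligible for the fused cuDNN kernel.
fast = gru_model()
# Non-default recurrent activation -> generic (non-cuDNN) implementation.
slow = gru_model(recurrent_activation="hard_sigmoid")

print("cuDNN-eligible GRU s/epoch:", seconds_per_epoch(fast))
print("non-standard GRU s/epoch:  ", seconds_per_epoch(slow))
```

The same fallback should occur for any deviation from the standard GRU configuration, which quantized activations necessarily introduce.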