Has anyone converted the FastSpeech2 or Tacotron models to full-integer quantized tflite models?
My representative dataset generator for FastSpeech2 is raising a floating point exception during conversion; any ideas what I might be doing wrong? This seems close to enabling the int8x8 tflite model. I need to run on an Arm Ethos-U55 NPU, where floating-point support is limited. I don't care much about quantization error for now; I'm more interested in profiling the model on the U55 once I have a tflite file. We can use tricks like QAT later if we need to reduce the quantization error.
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Provide a set of input samples that are representative of the data
    # the model will see during inference.
    for _ in range(1):  # Adjust the number of samples as needed
        # Note: np.random.randint(0, 1, ...) always returns 0 because the
        # upper bound is exclusive; sample from the real token-id range
        # instead (the vocab size of 100 here is just a placeholder).
        input_ids = tf.convert_to_tensor(
            np.random.randint(1, 100, size=(1, 50), dtype=np.int32))  # Example input shape
        speaker_ids = tf.convert_to_tensor(np.array([1], dtype=np.int32))
        speed_ratios = tf.convert_to_tensor(np.array([1.0], dtype=np.float32))
        f0_ratios = tf.convert_to_tensor(np.array([1.0], dtype=np.float32))
        energy_ratios = tf.convert_to_tensor(np.array([1.0], dtype=np.float32))
        print(input_ids.shape)  # Debug: confirm the expected (1, 50) shape
        yield [input_ids, speaker_ids, speed_ratios, f0_ratios, energy_ratios]
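
For reference, this is a minimal sketch of how the generator above would typically be wired into the converter for full-integer quantization. The SavedModel path and output filename are placeholders, and restricting the op set to TFLITE_BUILTINS_INT8 is the standard recipe for int8-only targets like the Ethos-U55, not something specific to FastSpeech2:

import tensorflow as tf

# Load the exported FastSpeech2 SavedModel (path is a placeholder).
converter = tf.lite.TFLiteConverter.from_saved_model("fastspeech2_saved_model")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Force all ops onto the int8 kernel set; ops without int8 kernels will
# fail conversion here rather than silently falling back to float.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
tflite_model = converter.convert()

with open("fastspeech2_int8.tflite", "wb") as f:
    f.write(tflite_model)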