Documentation for exporting openai/whisper-large-v3 to ONNX #1752
@mmingo848 You can use optimum-cli export onnx --help to see the available options, then export with:
optimum-cli export onnx --model openai/whisper-large-v3 whisper_onnx
and load the result with ORTModelForSpeechSeq2Seq. Feel free to refer to:
Let me know if this documentation is helpful!
@fxmarty the log:
you can see
@MrRace Yes, it can happen; I would not be worried. We should improve the warning.
@fxmarty I exported the Whisper ONNX model files using the following command:
Under the export directory. However, the
@MrRace You need
@fxmarty Thank you very much for your response. However, after following the commands you provided, the following error occurred. How can I fix this error? Thanks again.
optimum: 1.18.0
Yes, this was fixed in #1780, which is not yet in a release. Please downgrade to onnx 1.15 or use optimum from source.
@fxmarty Thanks a lot, it works now. After obtaining the
The above code will raise an error, such as ValueError: Required inputs (['encoder_hidden_states', 'past_key_values.0.decoder.key', 'past_key_values.0.decoder.value', 'past_key_values.0.encoder.key', 'past_key_values.0.encoder.value', 'past_key_values.1.decod ... and so on.
Hi @MrRace, if you don't want to reimplement the inference code from scratch, I advise you to use https://huggingface.co/docs/optimum/main/en/onnxruntime/package_reference/modeling_ort#optimum.onnxruntime.ORTModelForSpeechSeq2Seq. An example is available there. By default, only
I advise you to use https://github.com/lutzroeder/netron if you would like to visualize the ONNX graphs and understand their inputs/outputs.
@fxmarty Thanks a lot for your reply. Yes, I want to implement it from scratch to better understand the overall inference process.
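For anyone attempting the same thing, a hypothetical sketch of a from-scratch greedy decoding loop over the separately exported files (encoder_model.onnx plus the cache-free decoder_model.onnx). The input/output names follow the usual optimum export convention but are an assumption here; the decoder_with_past variant would instead take the past_key_values.* inputs and only the newest token:

```python
import numpy as np

def greedy_decode(encoder_session, decoder_session, input_features,
                  start_token, end_token, max_new_tokens=32):
    # 1) Run the encoder once on the log-mel features.
    encoder_hidden_states = encoder_session.run(
        None, {"input_features": input_features}
    )[0]
    # 2) Without a KV cache, re-feed the growing token prefix each step.
    tokens = [start_token]
    for _ in range(max_new_tokens):
        logits = decoder_session.run(
            None,
            {
                "input_ids": np.array([tokens], dtype=np.int64),
                "encoder_hidden_states": encoder_hidden_states,
            },
        )[0]
        next_token = int(logits[0, -1].argmax())  # greedy pick
        tokens.append(next_token)
        if next_token == end_token:
            break
    return tokens

# Usage with the real files would look roughly like:
# import onnxruntime as ort
# enc = ort.InferenceSession("whisper_onnx/encoder_model.onnx")
# dec = ort.InferenceSession("whisper_onnx/decoder_model.onnx")
# tokens = greedy_decode(enc, dec, features, start_token, end_token)
```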
Feature request
Hello, I am exporting OpenAI whisper-large-v3 to ONNX and see it exports several files, most importantly in this case the encoder (encoder_model.onnx & encoder_model.onnx.data) and decoder (decoder_model.onnx, decoder_model.onnx.data, decoder_with_past_model.onnx, decoder_with_past_model.onnx.data) files. I'd also like to reuse as much as possible of the pipeline with the new ONNX files:
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
    max_new_tokens=128,
    chunk_length_s=30,
    batch_size=16,
    return_timestamps=True,
    torch_dtype=torch_dtype,
    device=device,
)
Is there documentation that explains how to incorporate all these different things? I know transformer models go through a much different process here, and I cannot find a clear A -> B guide on how to export this model and perform tasks such as quantization, etc. I see I can do the following for the tokenizer with ONNX, but I'd like more insight into the rest I mentioned above (how to use the separate ONNX files and how to reuse as much of the preexisting pipeline as possible).
processor.tokenizer.save_pretrained(onnx_path)
I also see I can do:
model = ORTModelForSpeechSeq2Seq.from_pretrained( model_id, export=True )
but I cannot find documentation on how to specify where it is exported to, which seems like I am either missing something fairly simple or it is just not hyperlinked in the documentation.
Motivation
I'd love to see further documentation on the entire export process for this highly popular model. Deployment is significantly slowed by the lack of an easy-to-find A -> B process for exporting the model and reusing the pipeline from the vanilla model.
Your contribution
I am able to provide additional information to make this process easier.