Add MPT onnx and ORT support #1161
Hi @jiqing-feng, thank you for the PR! I'm personally not in favor of supporting custom models in Optimum that otherwise use What is your opinion @echarlaix @JingyaHuang @michaelbenayoun? I propose #1166 #1143 in this regard, see the API here: https://moon-ci-docs.huggingface.co/docs/optimum/pr_1166/en/exporters/onnx/usage_guides/export_a_model#custom-export-of-transformers-models
@fxmarty @jiqing-feng For ease of maintenance, Optimum should only support configs of transformers native models. For those with btw, mpt sounds like a good example to test what you suggest, @fxmarty.
@jiqing-feng Yes, once mpt is merged and released in transformers, we can certainly merge this PR in Optimum! My suggestion is more for models with custom modeling code (as mpt is currently).
Hi @fxmarty. MPT has now been merged into HF models, see 24629. Could we merge this PR to support the MPT ONNX config? cc @JingyaHuang
Hi @jiqing-feng, for sure we can merge once there is a transformers release! Is the KV cache layout fine?
Yes! The KV cache layout is the same as in most CLM models, such as llama. We can wait for the release.
Hi @fxmarty @JingyaHuang. I think we can move forward on this PR, since the recently released version of transformers contains the MPT model. I see that some checks have failed. Could you please tell me where to add tests and what kind of tests I should add? Thanks!
Hi @echarlaix @fxmarty @JingyaHuang, could we merge this PR? The failed checks are not related to my changes, and the mpt model is included in the latest transformers release. Thanks!
@jiqing-feng Thank you for the addition, and apologies for the delay!
Hi @fxmarty @echarlaix
Related to 1101. This PR enables the ONNX config for MPT models to support generating dummy inputs.
The shape arrangement of `past_key_values` may differ across models. We can use the member variable `sequence_length` of `DummyPastKeyValuesGenerator` to identify the sequence length of `past_key_values`.
Would you please help me review it? Thanks
cc @changwangss
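To illustrate the idea in the description above, here is a minimal sketch of using a generator's `sequence_length` to locate the sequence axis of a `past_key_values` tensor whose layout is not known in advance. It is not Optimum's implementation: `SketchPastKeyValuesGenerator` and `find_sequence_axis` are hypothetical names, and the `(batch, heads, sequence, head_dim)` layout is only an assumption based on the comment that MPT matches most CLM models.

```python
from dataclasses import dataclass
from typing import List, Tuple

Shape = Tuple[int, ...]


@dataclass
class SketchPastKeyValuesGenerator:
    """Hypothetical stand-in, NOT Optimum's DummyPastKeyValuesGenerator."""

    batch_size: int
    num_attention_heads: int
    hidden_size: int
    num_layers: int
    sequence_length: int

    def shapes(self) -> List[Tuple[Shape, Shape]]:
        # Assumed layout: (batch, heads, sequence, head_dim) for keys and values.
        head_dim = self.hidden_size // self.num_attention_heads
        shape = (
            self.batch_size,
            self.num_attention_heads,
            self.sequence_length,
            head_dim,
        )
        return [(shape, shape) for _ in range(self.num_layers)]


def find_sequence_axis(shape: Shape, sequence_length: int) -> int:
    """Return the single axis whose size equals sequence_length."""
    matches = [axis for axis, size in enumerate(shape) if size == sequence_length]
    if len(matches) != 1:
        # Ambiguous or missing: pick a sequence_length distinct from other dims.
        raise ValueError(f"cannot identify sequence axis in shape {shape}")
    return matches[0]


gen = SketchPastKeyValuesGenerator(
    batch_size=2,
    num_attention_heads=4,
    hidden_size=32,
    num_layers=3,
    sequence_length=16,
)
key_shape, value_shape = gen.shapes()[0]
seq_axis = find_sequence_axis(key_shape, gen.sequence_length)
```

The lookup only works when `sequence_length` differs from every other dimension, so a dummy generator would need to pick a distinctive value; a model storing keys as `(batch, heads, head_dim, sequence)` would be detected the same way, with a different axis returned.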