How can gpt.pt be exported to ONNX? #618
Comments
Also looking for a solution.
You can export it in chunks: export each decoder layer, the LM head, the Embedding, and the sample head as separate ONNX files.
#622 — an export script I wrote earlier.
@ZillaRU Wow, so the gpt ends up split across 10+ exported ONNX files — that's a bit confusing. Is there really no way to consolidate it further?
@ZillaRU Thanks for sharing the approach. May I ask: is such a fine-grained split needed to match the operator implementations in the cpp side? If the goal is only TensorRT acceleration, could some of the pieces be fused instead?
You can refer to https://github.com/tpoisonooo/llama.onnx for the export; ChatTTS's gpt is essentially a small LLaMA. The split is fine-grained because every decoder layer has an identical structure, so exporting each one separately makes it easy to optimize and test a single block in isolation — optimizing one is effectively optimizing them all. It also makes it much easier to trace where the error comes from when quantizing.
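The quantization-debugging point above can be illustrated in isolation: with one block exported on its own, you can quantize just that block's weights and measure its output error directly, instead of seeing only the accumulated error of the whole model. This is a toy sketch with a single linear weight and simulated int8 rounding, not the real ChatTTS weights or a real quantizer.

```python
# Hedged sketch: per-block quantization error measurement.
# Toy data only; `fake_quant_int8` simulates symmetric int8 rounding.
import torch

def fake_quant_int8(w: torch.Tensor) -> torch.Tensor:
    """Round weights to 255 symmetric int8 levels, then dequantize."""
    scale = w.abs().max() / 127.0
    return (w / scale).round().clamp(-127, 127) * scale

torch.manual_seed(0)
w = torch.randn(64, 64)   # stand-in for one block's weight matrix
x = torch.randn(8, 64)    # stand-in activations entering this block

ref = x @ w.T                       # full-precision block output
quant = x @ fake_quant_int8(w).T    # same block with quantized weights
err = (ref - quant).abs().max().item()
print(f"per-block max abs error: {err:.4f}")
```

Running the same measurement block by block pinpoints which layer contributes the most quantization error, which is hard to see once all layers are fused into one graph.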
This issue was closed because it has been inactive for 15 days since being marked as stale.
The other models are easy to export; only this one I can't figure out. Hoping ONNX export support can be added.