
How to export gpt.pt to ONNX? #618

Closed
Baiyuetribe opened this issue Jul 23, 2024 · 7 comments · Fixed by #622
Labels
stale The topic has been ignored for a long time

Comments

@Baiyuetribe

Exporting the other models is easy enough; this is the one I can't figure out. It would be great to add ONNX export support.

@ZaymeShaw
Contributor

Also looking for a solution.

@ZillaRU
Contributor

ZillaRU commented Jul 23, 2024

You can export it block by block: export each decoder layer, the LM head, the embedding, and the sampling head separately.
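An illustrative sketch of this block-by-block export, not the actual script from #622. It assumes the ChatTTS GPT wraps a Hugging Face LlamaModel with the usual attribute names (model.embed_tokens, model.layers, lm_head), and that your transformers version accepts attention_mask/position_ids directly on the decoder layer; the sampling head would be exported the same way as the LM head.

```python
# Sketch only: per-block ONNX export of a Llama-style GPT.
# Attribute names and layer call signature are assumptions, not ChatTTS's actual API.
import os
import torch
import torch.nn as nn


class DecoderLayerWrapper(nn.Module):
    """Wraps one decoder layer so it takes and returns plain tensors only."""

    def __init__(self, layer):
        super().__init__()
        self.layer = layer

    def forward(self, hidden_states, attention_mask, position_ids):
        # Keep only the hidden states; caches/attentions are dropped
        # to keep the exported graph simple.
        return self.layer(
            hidden_states,
            attention_mask=attention_mask,
            position_ids=position_ids,
        )[0]


def export_blocks(llama, hidden_size, out_dir="onnx_blocks", seq_len=16):
    """Export the embedding, every decoder layer, and the LM head as separate ONNX files."""
    os.makedirs(out_dir, exist_ok=True)
    dummy_ids = torch.zeros(1, seq_len, dtype=torch.long)
    dummy_hidden = torch.zeros(1, seq_len, hidden_size)
    dummy_mask = torch.zeros(1, 1, seq_len, seq_len)
    dummy_pos = torch.arange(seq_len).unsqueeze(0)
    dyn = {0: "batch", 1: "seq"}

    # Embedding block
    torch.onnx.export(
        llama.model.embed_tokens, (dummy_ids,), f"{out_dir}/embed.onnx",
        input_names=["input_ids"], output_names=["hidden_states"],
        dynamic_axes={"input_ids": dyn, "hidden_states": dyn},
    )

    # Each decoder layer (they all share the same structure)
    for i, layer in enumerate(llama.model.layers):
        torch.onnx.export(
            DecoderLayerWrapper(layer),
            (dummy_hidden, dummy_mask, dummy_pos),
            f"{out_dir}/layer_{i}.onnx",
            input_names=["hidden_states", "attention_mask", "position_ids"],
            output_names=["output"],
            dynamic_axes={"hidden_states": dyn, "output": dyn},
        )

    # LM head
    torch.onnx.export(
        llama.lm_head, (dummy_hidden,), f"{out_dir}/lm_head.onnx",
        input_names=["hidden_states"], output_names=["logits"],
        dynamic_axes={"hidden_states": dyn, "logits": dyn},
    )
```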

@ZillaRU
Contributor

ZillaRU commented Jul 23, 2024

#622 is the export script I wrote earlier.
https://zhuanlan.zhihu.com/p/703240560

@Baiyuetribe
Author

@ZillaRU Wow, I'm a little puzzled that the GPT ends up split into more than ten ONNX files. Is there really no way to merge them a bit more?

@ZaymeShaw
Contributor

ZaymeShaw commented Jul 23, 2024

@ZillaRU Thanks for sharing the approach. May I ask whether the split has to be this fine-grained because each piece needs to map to an operator implementation in the C++ code? If the goal is just TensorRT acceleration, could some of the blocks be fused?

@ZillaRU
Contributor

ZillaRU commented Jul 24, 2024

You can refer to https://github.com/tpoisonooo/llama.onnx for the export; ChatTTS's gpt is essentially a small LLaMA. The split is this fine-grained because every decoder layer has the same structure, so exporting them separately makes it easy to optimize, test, and verify a single block. Optimizing one block is effectively optimizing them all. It also makes it easier to trace the source of error when quantizing.
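To that last point, here is a small illustrative check of the kind the per-layer split makes easy: comparing one exported decoder layer against its PyTorch counterpart. It assumes the hypothetical layer_0.onnx file, input names, and DecoderLayerWrapper from the sketch above, plus onnxruntime; the default hidden_size is a placeholder.

```python
# Sketch only: compare one exported decoder layer against the PyTorch block.
import numpy as np
import onnxruntime as ort
import torch


def check_layer(layer_wrapper, onnx_path="onnx_blocks/layer_0.onnx",
                hidden_size=768, seq_len=16):
    hidden = torch.randn(1, seq_len, hidden_size)
    mask = torch.zeros(1, 1, seq_len, seq_len)
    pos = torch.arange(seq_len).unsqueeze(0)

    # Reference output from the original PyTorch block.
    with torch.no_grad():
        ref = layer_wrapper(hidden, mask, pos).numpy()

    # Output from the exported ONNX block.
    sess = ort.InferenceSession(onnx_path)
    out = sess.run(None, {
        "hidden_states": hidden.numpy(),
        "attention_mask": mask.numpy(),
        "position_ids": pos.numpy(),
    })[0]

    # Because every decoder layer is structurally identical, a per-layer
    # error report like this localizes conversion/quantization issues quickly.
    print("max abs diff:", np.abs(ref - out).max())
```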

@2noise 2noise deleted a comment from rose07 Jul 24, 2024
@fumiama fumiama linked a pull request Jul 28, 2024 that will close this issue
@github-actions github-actions bot added the stale The topic has been ignored for a long time label Aug 24, 2024

github-actions bot commented Sep 8, 2024

This issue was closed because it has been inactive for 15 days since being marked as stale.

@github-actions github-actions bot closed this as completed Sep 8, 2024