How can gpt.pt be exported to ONNX? #618
Comments
Also looking for a solution.
You can export it in chunks: export each decoder layer, the LM head, the Embedding, and the sample head as separate ONNX files.
#622 — an export script I wrote earlier.
@ZillaRU Wow, so the gpt ends up split across 10+ exported ONNX files — that's a bit confusing. Is there really no way to consolidate it further?
@ZillaRU Thanks for sharing the approach. May I ask: is such a fine-grained split needed to match the operator implementations in the cpp side? If the goal is only TensorRT acceleration, could some of the pieces be fused instead?
You can refer to https://github.com/tpoisonooo/llama.onnx for the export; ChatTTS's gpt is essentially a small LLaMA. The split is fine-grained because every decoder layer has an identical structure, so exporting each one separately makes it easy to optimize and test a single block in isolation — optimizing one is effectively optimizing them all. It also makes it much easier to trace where the error comes from when quantizing.
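The quantization-debugging point above can be illustrated in isolation: with one block exported on its own, you can quantize just that block's weights and measure its output error directly, instead of seeing only the accumulated error of the whole model. This is a toy sketch with a single linear weight and simulated int8 rounding, not the real ChatTTS weights or a real quantizer.

```python
# Hedged sketch: per-block quantization error measurement.
# Toy data only; `fake_quant_int8` simulates symmetric int8 rounding.
import torch

def fake_quant_int8(w: torch.Tensor) -> torch.Tensor:
    """Round weights to 255 symmetric int8 levels, then dequantize."""
    scale = w.abs().max() / 127.0
    return (w / scale).round().clamp(-127, 127) * scale

torch.manual_seed(0)
w = torch.randn(64, 64)   # stand-in for one block's weight matrix
x = torch.randn(8, 64)    # stand-in activations entering this block

ref = x @ w.T                       # full-precision block output
quant = x @ fake_quant_int8(w).T    # same block with quantized weights
err = (ref - quant).abs().max().item()
print(f"per-block max abs error: {err:.4f}")
```

Running the same measurement block by block pinpoints which layer contributes the most quantization error, which is hard to see once all layers are fused into one graph.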
This issue was closed because it has been inactive for 15 days since being marked as stale.
The other models are easy to export; only this one I can't figure out. Hoping ONNX export support can be added.