So what's the fastest way to do distributed training of CTR models with PyTorch these days? #161
guotong1988
started this conversation in
General
Replies: 1 comment
-
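For CTR, a PS (Parameter Server) architecture is definitely the better fit. All Reduce targets large dense models such as NLP/CV and is not suited to sparse models. BTW: in my spare time I wrote a large-scale sparse training framework based on PyTorch; happy to discuss if you're interested.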
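To put rough numbers on the dense-vs-sparse point, here is a back-of-the-envelope sketch in plain Python. It is not taken from the framework mentioned above. It assumes a naive setup where the full embedding-table gradient is all-reduced as a dense tensor with a ring algorithm, and compares that with a parameter server where each worker only pulls the rows it needs and pushes gradients for those rows. All sizes (`NUM_ROWS`, `DIM`, `WORKERS`, `UNIQUE_IDS`) and both helper functions are hypothetical, chosen only for illustration.

```python
def ring_allreduce_bytes_per_worker(num_rows, dim, num_workers, bytes_per_elem=4):
    """Bytes each worker sends in a ring all-reduce over the full dense gradient.

    A ring all-reduce sends 2 * (N - 1) / N times the tensor size per worker.
    """
    tensor_bytes = num_rows * dim * bytes_per_elem
    return 2 * (num_workers - 1) * tensor_bytes // num_workers


def ps_bytes_per_worker(unique_ids, dim, bytes_per_elem=4, bytes_per_id=8):
    """Bytes each worker moves per step with a parameter server (assumed protocol)."""
    ids = unique_ids * bytes_per_id            # ids sent with the pull request
    rows = unique_ids * dim * bytes_per_elem   # embedding rows pulled back
    grads = unique_ids * (bytes_per_id + dim * bytes_per_elem)  # ids + row grads pushed
    return ids + rows + grads


# Hypothetical sizes for a CTR-style embedding table (assumptions, not measurements).
NUM_ROWS = 100_000_000   # rows in the embedding table
DIM = 16                 # embedding dimension, float32
WORKERS = 8              # number of training workers
UNIQUE_IDS = 50_000      # distinct feature ids touched by one worker's batch

dense = ring_allreduce_bytes_per_worker(NUM_ROWS, DIM, WORKERS)
sparse = ps_bytes_per_worker(UNIQUE_IDS, DIM)

print(f"dense all-reduce: {dense / 1e9:.2f} GB per worker per step")
print(f"PS pull/push:     {sparse / 1e6:.2f} MB per worker per step")
```

Under these assumptions the dense path moves over a thousand times more data per step, because almost all rows of the gradient are zero for any single batch. Real systems can narrow the gap (for example by exchanging sparse gradients instead of the dense tensor), so treat this as an illustration of the argument rather than a benchmark.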
-
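Parameter Server architecture or All Reduce architecture?
CPU or GPU?
Is there any open-source code to use as a reference?
Does it require modifying the PyTorch source code?
Which option gives the best cost-performance?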