🚀 Describe the improvement or the new tutorial
I propose adding a tutorial on implementing and using minGRU (minimal Gated Recurrent Unit) to the PyTorch tutorials. This addition would provide valuable insights into efficient sequence modeling techniques for the PyTorch community.
Benefits for PyTorch users:

- Efficiency: up to 1324x faster than a standard GRU on 4096-token sequences, with comparable accuracy.
- Competitive performance: matches state-of-the-art models such as Mamba in language modeling and reinforcement learning.

Paper
"Were RNNs All We Needed?" (Feng et al., 2024)
Existing tutorials on this topic
No response
Additional context
If you like this idea, I'm ready to jump in! I could have a PR ready as soon as tomorrow. I'm thinking of contributing a tutorial on how to use or train minGRU for language modeling; a rough sketch of the core recurrence is below.
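To give a feel for what the tutorial would cover, here is a minimal sketch of the minGRU recurrence from the paper, written in plain sequential form. The `MinGRU` class name and the Python loop are my own illustration, not the paper's reference code; the key property it shows is that the gate and candidate state depend only on the current input `x_t`, never on `h_{t-1}`.

```python
from typing import Optional

import torch
import torch.nn as nn


class MinGRU(nn.Module):
    """Sequential-mode minGRU sketch. Unlike a standard GRU, neither the
    update gate nor the candidate state reads the previous hidden state,
    which is what makes the recurrence parallelizable."""

    def __init__(self, input_size: int, hidden_size: int) -> None:
        super().__init__()
        self.linear_z = nn.Linear(input_size, hidden_size)  # update gate z_t
        self.linear_h = nn.Linear(input_size, hidden_size)  # candidate state h~_t

    def forward(self, x: torch.Tensor, h: Optional[torch.Tensor] = None) -> torch.Tensor:
        # x: (batch, seq_len, input_size) -> output: (batch, seq_len, hidden_size)
        batch, seq_len, _ = x.shape
        if h is None:
            h = x.new_zeros(batch, self.linear_h.out_features)
        outputs = []
        for t in range(seq_len):
            x_t = x[:, t]
            z = torch.sigmoid(self.linear_z(x_t))  # z_t = sigmoid(W_z x_t)
            h_tilde = self.linear_h(x_t)           # h~_t = W_h x_t (no tanh in minGRU)
            h = (1 - z) * h + z * h_tilde          # h_t = (1 - z_t) * h_{t-1} + z_t * h~_t
            outputs.append(h)
        return torch.stack(outputs, dim=1)


# Hypothetical usage: 8 sequences of length 128 with 32 input features.
model = MinGRU(input_size=32, hidden_size=64)
out = model(torch.randn(8, 128, 32))  # shape: (8, 128, 64)
```

The O(T) Python loop above is only for clarity. The speedups reported in the paper come from replacing it with a parallel scan, which is possible precisely because each step is affine in `h_{t-1}` (the paper also uses a log-space formulation for numerical stability); the tutorial would walk through that parallel training mode as well.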
@svekars @albanD