Non-closure Prodigy? #12

Open
madman404 opened this issue Apr 9, 2024 · 4 comments

Comments

@madman404

I've gotten a lot of use out of Prodigy over the past few months, and I'd love to take advantage of schedule-free optimization alongside it. I see that there is a reference example in the repo, but it uses a closure. The problem is that the training loop I am using is not set up for closures, and I am not very smart: I don't understand nearly enough about the math here to create a non-closure implementation. Would it be possible to provide one, like the ones provided for AdamW and SGD?
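For readers unfamiliar with the distinction, here is a minimal pure-Python sketch of the two calling conventions on a one-parameter toy problem. Everything in it (`Param`, `backward`, `ToyClosure`, `ToyNoClosure`) is hypothetical and written for illustration; it is neither Prodigy nor the code in this repo. It assumes the plain schedule-free SGD update with uniform averaging: the gradient is taken at an interpolated point `y`, a base sequence `z` takes the SGD step, and `x` is the running average used for evaluation. The closure form lets the optimizer move the parameter to `y` before calling the forward/backward pass itself. The non-closure form keeps the parameter at `y` during training, so it needs explicit `train()` / `eval()` switches, much like the `optimizer.train()` / `optimizer.eval()` calls required by the non-closure optimizers here.

```python
class Param:
    """Hypothetical stand-in for a model parameter with a gradient slot."""
    def __init__(self, value):
        self.value = value
        self.grad = 0.0


def backward(p, target=3.0):
    """Toy forward/backward: loss (w - target)^2, gradient stored in p.grad."""
    p.grad = 2.0 * (p.value - target)
    return (p.value - target) ** 2


class ToyScheduleFree:
    """State and update rule shared by both calling conventions (toy only)."""
    def __init__(self, p, lr=0.1, beta=0.9):
        self.p, self.lr, self.beta = p, lr, beta
        self.z = p.value   # base SGD sequence
        self.x = p.value   # running average, the point to evaluate at
        self.t = 1

    def _y(self):
        # Interpolated point where the gradient has to be taken.
        return (1 - self.beta) * self.z + self.beta * self.x

    def _update(self, grad_at_y):
        self.t += 1
        self.z -= self.lr * grad_at_y
        c = 1.0 / self.t                       # uniform averaging weight
        self.x = (1 - c) * self.x + c * self.z


class ToyClosure(ToyScheduleFree):
    def step(self, closure):
        self.p.value = self._y()   # move the parameter to y
        loss = closure()           # optimizer triggers forward/backward at y
        self._update(self.p.grad)
        self.p.value = self.x      # leave the parameter at x between steps
        return loss


class ToyNoClosure(ToyScheduleFree):
    def train(self):
        self.p.value = self._y()   # parameter holds y while training

    def eval(self):
        self.p.value = self.x      # parameter holds x for evaluation

    def step(self):
        self._update(self.p.grad)  # the loop already computed the gradient at y
        self.p.value = self._y()


def run_closure(steps=100):
    p = Param(0.0)
    opt = ToyClosure(p)
    for _ in range(steps):
        opt.step(lambda: backward(p))
    return p.value


def run_no_closure(steps=100):
    p = Param(0.0)
    opt = ToyNoClosure(p)
    opt.train()
    for _ in range(steps):
        backward(p)    # the training loop runs forward/backward itself
        opt.step()     # the optimizer only applies the update
    opt.eval()
    return p.value
```

Both loops produce the same iterates in this toy. The only difference is who is responsible for placing the parameter at `y` before the gradient is computed. A real implementation can differ in what state it stores, and an optimizer like Prodigy adds its own step-size estimation on top, which this sketch does not attempt to model.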

@adefazio
Contributor

adefazio commented Apr 9, 2024

We are working on an implementation; we are just finalizing the theory now. We will need to verify that it works correctly on a large set of problems before we release it.

@sdbds

sdbds commented Apr 17, 2024

I'm waiting for this too.

@madman404
Author

Is the new-ish wrapper compatible with prodigy, or is there still work being done on a dedicated combination of the two? I understand the wrapper is meant for arbitrary optimizers, but prodigy is also a little more... involved than usual.

@adefazio
Contributor

adefazio commented Aug 9, 2024

I've been experimenting with combining the two, but I'm not yet seeing performance on par with using Prodigy by itself. I'm still unsure why.


3 participants