Juice for Deep Reinforcement Learning #155
Hey, it should definitely work and PRs are very welcome! I have to look into the details again; could you provide a simple example flow that yields your panic? The reference does not show a line number. At the least, a backtrace would be very helpful!
Here's the crash stack:
I've tried to dive into the source code a bit, and so far I'm very confused. Not sure if this is the best way to ask specific technical questions (let me know if you prefer a different way), but here are some:
Sorry if this looks like criticism -- I've tried to capture the thoughts of a person who is fresh to the code. Hopefully you find it useful :)
For most solvers this seems reasonable, and in the worst case one could run multiple single-pass runs within the minibatch. So not much pain with a bit of gain :)
Off the top of my head, I don't quite remember where that is; could you add a reference?
Imho it's exactly that, and by adding to the
This is indeed a pitfall,
The whole
Not convinced this would help much. Currently there is only sequential, but the idea of making everything a layer does maintain flexibility; what would be needed is a
No, the criticism is very much needed and very, very much appreciated. Thanks for taking the time to dive into the code!
OK, let me take a stab at some renaming. I'm thinking of this:
I think it also makes sense to move it. However, this is just a refactor, and the core issue in this comment persists.
Not sure this is the best idea, since the user-facing part should keep its name. I don't think there is a good place for it.
My approach would be to simulate a minibatch: take a sample, duplicate it internally to match the true minibatch cardinality, and pretend it was a minibatch. This is the dirty variant. My time seems to be rather over-committed lately, so I have not yet had time for a deep dive into the issue. Happy to discuss PRs, though; I just want to urge you to first write down an example of how you'd expect to use the API and create a draft, and then we can discuss based on that how to best achieve an impl that provides the desired features, if that sounds good to you :)
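A minimal sketch of that "dirty variant" in plain Rust (a hypothetical helper, not Juice API): tile the single sample to the configured batch size, run the forward pass as usual, and read back only the first row of the output.

```rust
/// Hypothetical helper, not part of Juice: replicate one sample so the
/// network sees a tensor with the batch cardinality it was built for.
fn simulate_minibatch(sample: &[f32], batch_size: usize) -> Vec<f32> {
    let mut batch = Vec::with_capacity(sample.len() * batch_size);
    for _ in 0..batch_size {
        batch.extend_from_slice(sample);
    }
    batch
}
```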
Well, Layers are not exactly user-facing. The user mostly deals with LayerConfigs. The only place where the user can get their hands on a Layer is through
Another possibility is to put it into this new
Yeah, that's what I do now. It's a 32x increase in compute cost (to get one prediction I need to run the full batch, which is 32 vectors), but I don't see an easy way around it, unfortunately. I have some thoughts on how this might be improved, but it would require rearchitecting the framework (specifically, storing buffers not in layers but in some external context).
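A rough sketch of that "external context" idea, under heavy assumptions (`ExecutionContext` and its layout are invented here, not existing Juice types): layers would keep only their weights, while intermediate buffers live in a context keyed by batch size, so the same weights can serve a batch-32 training pass and a batch-1 prediction pass.

```rust
use std::collections::HashMap;

/// Hypothetical type: per-batch-size scratch buffers, one Vec per layer,
/// owned outside the layers themselves.
struct ExecutionContext {
    buffers: HashMap<usize, Vec<Vec<f32>>>,
}

impl ExecutionContext {
    fn new() -> Self {
        ExecutionContext { buffers: HashMap::new() }
    }

    /// Lazily allocate one buffer per layer for the requested batch size.
    fn buffers_for(&mut self, batch_size: usize, num_layers: usize) -> &mut Vec<Vec<f32>> {
        self.buffers
            .entry(batch_size)
            .or_insert_with(|| vec![Vec::new(); num_layers])
    }
}
```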
Sure, I don't want to land a big PR on you without aligning on the approach first; I value your time and mine :) Let me know if you want me to create a separate tracking issue for the renaming -- we're discussing many things in parallel, and it's getting confusing.
If this solves the practical issue for reinforcement learning without complicating the regular use case, go for it :)
Yeah, I'm still investigating the code. The simultaneous learning/using is probably not the biggest blocker right now. I just recently discovered that I will probably need asynchronous multi-task learning, so I'm trying to understand whether it's supported (I don't think it is, out of the box) or how hard it would be to add. I'm currently ingesting the code and writing a technical description of Juice internals, and will share it when it's done. From there, it'll be easier to understand whether it's possible/acceptable to adapt it for RL purposes. I don't have a ton of spare time myself (I have a full-time job and a family), so progress is slow.
I have a document (still in progress) here: https://docs.google.com/document/d/1arr0trMv6lDJmy9ejThlnoSri8BGgUJ64LVA0IzBTXE. It's open for viewing by all; I can add collaborators if requested (I think it'll require a gmail account).
OK, I think the document is ready (same link as above). I would appreciate a review (the link gives read-only access; I can only share for commenting with a gmail account, so IM me if you're interested). Some additional notes and thoughts that I had while working on the doc:
👍
This is indeed true; there is only a vague Juice book.
Happy to merge these; I think they are indeed an artifact of the past and a very early impl detail.
I had the same struggle when I started with
It is. The entirety of Juice is built around the API of cudnn, which is still the de facto standard lib. So this abstraction leaks the design of cudnn, but since that's the only fully covering impl (native is mostly on par, with a few exceptions), this will only change once an OpenCL impl is complete.
Which ones are you referring to? The inner
The whole solver impl is pretty much trash and needs to be redesigned. It's very inflexible and feels like an add-on to the rest. If you have input on how to achieve that, I'm happy to discuss; it's not trivial to align it with the rest of the API. Thanks again for writing up your thoughts.
I just sent PR #159. As stated in the PR description, it is not intended for merging; it's more of a suggestion on how to improve the layer framework and a conversation starter. I know we discussed that a short doc would be good before investing too much into code, but I found that actually writing the code is the best way for me to ensure my approach works and handles all cases. I'm not super attached to that PR either -- if we decide it's not the right way, I'm happy to drop it and work on another iteration 🙂 The primary motivating factors for this design are:
An interesting outcome of 1. above is that my design makes the batch size very explicit, and this uncovers some inconsistencies (at least I think they are inconsistencies) in layer structuring, like loss layers having outputs (see the comment in
Sorry for the delay, I should get around to it later today 🤞
Gentle ping on this one. |
Sorry for the long delay, it's been an eventful three weeks. For me, I'm already sold on just the fact that batch sizes can be adjusted as needed. This needs to be verified as compatible with the current cudnn API, but IIRC it should work with a few small tweaks. I'd say let's continue the code-related discussions in the PR -> #159
Hello,
I'm investigating a potential use of the Juice framework for deep reinforcement learning (I'm also learning RL and deep learning as I go, so apologies for potentially newbie questions). RL requires simultaneously training the net and using it for predictions. I've found several issues which I'm not sure are design decisions or implementation shortcuts:
`Solver` has a `network()` method, whose comment says that "This is the recommended method to get a usable trained network." However, you can't call `forward()` on it, since that requires a `mut` ref.

I can probably work around 1 (like artificially creating a batch by replicating a single input vector) and 2 (by using `mut_network()`), but it doesn't look right. Is this something that can (should?) be fixed in the implementation? I'm happy to provide PRs (but will likely require technical guidance).
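To make the `mut` friction concrete, here's a self-contained mock (not actual Juice code; `Network` and `Solver` here are bare stand-ins) showing why `network()` can't be used for prediction when `forward()` takes `&mut self`:

```rust
struct Network;

impl Network {
    // Stand-in for a real forward pass; the &mut self is the point.
    fn forward(&mut self, input: &[f32]) -> Vec<f32> {
        input.to_vec()
    }
}

struct Solver {
    net: Network,
}

impl Solver {
    fn network(&self) -> &Network { &self.net }              // "recommended" accessor
    fn mut_network(&mut self) -> &mut Network { &mut self.net }
}

fn main() {
    let mut solver = Solver { net: Network };
    // solver.network().forward(&[1.0]); // error[E0596]: cannot borrow as mutable
    let out = solver.mut_network().forward(&[1.0]);          // the workaround
    println!("{:?}", out);
}
```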
Thank you!