-
Notifications
You must be signed in to change notification settings - Fork 207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
3Delight capsule support #5791
base: main
Are you sure you want to change the base?
3Delight capsule support #5791
Conversation
Thanks Goran - Capsule support for 3Delight is certainly a very valuable feature. I don't know if this is something you may have tried and already rejected, but I think this might be achievable without resorting to the temp file, if we did something like the following :
If that works, I think it would be more elegant than the temp-file approach, and more performant too. Is this something you'd be willing to take a look at? |
Yes, I can definitely look into the above approach. Thanks so much for the pointers! |
I updated the code to switch the capsule from using a temp NSI file through a procedural to streaming data with a transform node as root. I didn't do any changes in regards to the IPR process, since from the tests I did everything seemed to work as expected (hope I haven't missed anything important there). |
I did a number of tests of both capsule approaches and, interestingly, the results were not one sided. The streaming data approach was significantly faster when a scene had a small number of capsules with a large number of instances. For ex. a scene based on the cow instances test scene that comes with Gaffer adjusted to create 5 million instances was almost two times faster when using streaming data compared to the external procedural approach. Surprisingly, the external procedural approach was both faster and used significantly less memory in the case of a large number of capsules. Using the Moana USD scene that has 32 million instances spread across a large number of capsules, the external procedural approach was around 40-50% faster and used slightly under 90 GB of RAM, with the streaming data approach using over 130 GB of RAM. Since depending on the type of the scene both approaches have their pros and cons, I wanted to propose having a switch in DelightOptions that determines which capsule approach would be used. If that sounds good I can update the code accordingly. |
@johnhaddon a gentle bump regarding the question/proposal above. |
Thanks for the updates @gkocov, and once again sorry for taking so long to get to this. Thanks also for benchmarking the two different approaches against each other.
My preference would be to have only the "streaming" approach without the generation of the temp file. Having a switch that can only be set meaningfully after careful benchmarking of a specific rendering setup seems very "un-3delighty", and temp files can be a bit fragile in practice. If someone really knows what they're doing and needs the extra performance then they could always do that with one DelightRender to output a Mostly the benchmarking makes me think that there must be opportunities to optimise the streaming path further. In Arnold we had problems sharing the InstanceCache and ShaderCache between procedurals and the main scene, but in 3Delight I don't think that problem should exist. So we should be able to have the ProceduralRenderer just share the caches with the main renderer, which could yield quite substantial improvements in some cases.
It would be great to include some unit tests with this. Minimally I think we'd want to demonstrate that rendering a procedural generates a transform with the children underneath, and that rendering the same procedural twice results in that same transform being instanced. If we can share the InstanceCache and ShaderCache then something demonstrating that working would be great too.
The potentially tricky thing I didn't see covered was the cleanup of the children when the procedural is deleted during an InteractiveRender. There is a flag that causes NSIDelete to recursively delete things and that might do the job. But we need to be careful only to delete things truly owned by the procedural, and not things that came out of a shared InstanceCache or ShaderCache. I'm not sure how we'd achieve test coverage for this, but it would be invaluable if there was a way of doing it. |
Keeping the streaming approach sounds good. I did some tests and we are indeed creating duplicate objects with the current code due to DelightProceduralRenderer creating a new InstanceCache each time it runs. I'll look into using the InstanceCache (and potentially ShaderCache and AttributesCache) from the main scene instead of creating new ones inside DelightProceduralRenderer. Once I have the above working I'll definitely create unit tests. Regarding the cleanup of the children, I was about to look into this, but (at least with the current code) the InteractiveRenderer behaved correctly without having explicit cleanup of the procedural's children in all the tests I did. Once I update the code to re-use the main scene's InstanceCache I'll redo the tests again to double check that the InteractiveRenderer still works as expected. |
I would expect them to disappear from the rendered image, because they're not accessible from the NSI_SCENE_ROOT node. But I'd still expect them to exist as nodes, using memory. |
@johnhaddon I've added support for recursive delete of capsules and also capsules now use a shared cache with the main scene. I was about to start on the test, but I couldn't find a way to create capsules with the IECore modules. I checked the IECoreArnoldTest suite, hoping I could find some examples there, but unfortunately I couldn't find any capsule tests there as well. I'll look into the GafferXYZ tests for any capsule examples next. In the meantime I've attached a simple Gaffer test scene and its NSI output showcasing that even though there are three objects in the Gaffer scene - a plane mesh and two capsules that the plane is also connected to - there is only one mesh object output in the NSI file feeding three NSI transforms representing the three Gaffer objects. |
Yep, Capsules are a Gaffer thing, one layer higher up in the software stack. But a Capsule is just a Procedural that sources its data from Gaffer, so you can test with a Procedural and that will be entirely valid. There's an example here in the IECoreArnold tests : https://github.com/GafferHQ/gaffer/blob/main/python/IECoreArnoldTest/RendererTest.py#L2993-L3048. |
Great! Thanks @johnhaddon ! That was super helpful! I've added namespaces for the objects inside a capsule (without it we could run into conflicts with same named objects when multiple capsules were present in the scene) and I've also added a test checking that the expected transforms for the capsule are created and that the re-use of instances due to the global instance cache works. |
Also, really happy to say that the capsule namespaces and the global caches made a huge difference on the Moana test scene and now we are roughly on par with the external procedural approach in regards to the render times and memory usage. |
Generally describe what this PR will do, and why it is needed
Checklist