Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GafferCycles : Refactor the problematic reset code #5101

Conversation

boberfly
Copy link
Collaborator

@boberfly boberfly commented Jan 21, 2023

Generally describe what this PR will do, and why it is needed

  • Remove and refactor the problematic reset code as we now only allow scene/session parameters to be set before rendering starts.
  • This will also avoid a mutex lock issue with Windows, and replaces the need for this PR: GafferCycles : Remove problematic mutex lock #5100

Related issues

  • Windows has deadlocks on scene/session resets due to mutex handling (a reset could delete the mutex and try to unlock a new one).

Dependencies

Breaking changes

Checklist

  • I have read the contribution guidelines.
  • I have updated the documentation, if applicable.
  • I have tested my change(s) in the test suite, and added new test cases where necessary.
  • My code follows the Gaffer project's prevailing coding style and conventions.

This will probably need a bit of testing before I'd be confident in merging, this could break interactive scene updating.

@boberfly boberfly force-pushed the fixes/cyclesParamsBeforeRender branch 3 times, most recently from 4d532ee to b71defb Compare January 21, 2023 13:40
Copy link
Member

@johnhaddon johnhaddon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Alex, it definitely seems like this is something we need to simplify. I don't really understand what we're doing here still though. Could you step back a bit and explain clearly what the problem is (what constraints Cycles is placing on us), and what our strategy is?

Implementationwise, I wonder if these are possible and would help with clarity?

  • Could we construct the session only once, the first time render() is called, so everything is deferred till that moment, instead of trying to reset things that were done earlier?
  • Could we query Cycles for what is changeable and what is not, so we're not duplicating information?

Some test coverage for all this would be pretty useful too...

{
// Make sure to apply any scene/session parameters first so we only hard-reset once
// before any rendering starts as we cannot set these after rendering starts.
if( m_renderState != RENDERSTATE_RENDERING )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intention of this check? That we will only ever call updateSessionParams() once? If so, how does that interact with the code in progress() that sets m_renderState = RENDERSTATE_STOPPED;. The comment there seems to imply that will happen whenever an interactive render converges and stops sampling.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Alex, it definitely seems like this is something we need to simplify. I don't really understand what we're doing here still though. Could you step back a bit and explain clearly what the problem is (what constraints Cycles is placing on us), and what our strategy is?

Implementationwise, I wonder if these are possible and would help with clarity?

Yep, so the goals of this draft PR (keyword draft! :) ) was to tackle the windows deadlock from @ericmehl 's #5100, and the comments in there probably explain my beginning process the best (copy-pasted):

There was talks/suggestions before that we could defer Cycles' scene/session creation to the first call to render() so we don't have the reset at all, but the current code as it stands needs the first reset so that the desired scene/session parameters can be set before rendering starts. I've not started this yet and the annoying thing is any calls to option() needs to be caching all of these somewhere without scene/session existing yet, and maybe other things I'm not thinking of.

  • Could we construct the session only once, the first time render() is called, so everything is deferred till that moment, instead of trying to reset things that were done earlier?

I eventually want to move it to render() if possible but I still think it'll be a big headache before that. The biggest one is the need for scene to exist when converting objects and shaders eg. scene->shader_manager usage, and I know when we get VDB's to work again, scene->image_manager is needed as well. This PR just has better intentions where we only allow all session/scene parameter changes before rendering starts, and not the possibility of those changing while interactive rendering like it was intended in the past. The warning in option() was just a print-out and nothing more.

  • Could we query Cycles for what is changeable and what is not, so we're not duplicating information?

https://github.com/blender/cycles/blob/master/src/session/session.h#L90 - this is the best I've found, the omission of some parameters in the modified() call.

Some test coverage for all this would be pretty useful too...

After draft status no worries

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the intention of this check? That we will only ever call updateSessionParams() once? If so, how does that interact with the code in progress() that sets m_renderState = RENDERSTATE_STOPPED;. The comment there seems to imply that will happen whenever an interactive render converges and stops sampling.

Alright going over the enum states I've just now made a bool m_rendering, the states were needed pre-CyclesX when a finished interactive render would cease to update, but that's not needed anymore.

@@ -2500,6 +2488,10 @@ IECore::InternedString g_textureBlurGlossyOptionName( "cycles:texture:blur_gloss
IECore::InternedString g_textureUseCustomCachePathOptionName( "cycles:texture:use_custom_cache_path" );
IECore::InternedString g_textureCustomCachePathOptionName( "cycles:texture:custom_cache_path" );

// Some of the session parameters that can be changed in interactive rendering
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some of? Does that mean there are others? Is it possible to query what is changeable and what is not via the Cycles API?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that I know of, the best place is the modified function's omitted ones to flag:
https://github.com/blender/cycles/blob/master/src/session/session.h#L90

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, could we just build a new set of session parameters, and then call modified() to see if we can apply them? Rather than duplicating the logic?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright I've got something in the OPTION macro to do the error print now which simplifies things quite a bit.

src/GafferCycles/IECoreCyclesPreview/Renderer.cpp Outdated Show resolved Hide resolved
OPTION(bool, m_sessionParams, g_useProfilingOptionName, use_profiling);
OPTION(bool, m_sessionParams, g_useAutoTileOptionName, use_auto_tile);
OPTION(int, m_sessionParams, g_tileSizeOptionName, tile_size);
OPTION(bool, m_sessionParams, g_useResolutionDividerOptionName, use_resolution_divider);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introducing new options in this commit doesn't help with clarity.

Comment on lines 3377 to 3367
// Check if an OSL shader exists & set the shadingsystem
if( m_sessionParams.shadingsystem == ccl::SHADINGSYSTEM_SVM && m_shaderCache->hasOSLShader() )
{
IECore::msg( IECore::Msg::Warning, "CyclesRenderer", "OSL Shader detected, forcing OSL shading-system (CPU-only)" );
m_sessionParams.shadingsystem = ccl::SHADINGSYSTEM_OSL;
m_sceneParams.shadingsystem = ccl::SHADINGSYSTEM_OSL;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced this is useful. It doesn't stop someone starting a render in SVM and then adding an OSL shader later on. And it doesn't look like it actually switches to CPU mode either? I think we would be better off not trying to be clever here, and just rely on the warning in updateSceneObjects().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The call to init() will use the fallback to CPU code, and later when I think 3.5 lands in with Optix allowing it, we can keep the device choosing code in init() was the intention.

Not too fussed about this call to remove the cleverness, I wrote it originally when SVM was the default and I didn't want the shaderball to be useless.

Comment on lines 3361 to 3333
IECore::msg( IECore::Msg::Error, "CyclesRenderer", "OSL Shader detected, this will cause problems in a running interactive render" );
m_shaderCache->removeOSLShaders();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What problems will this cause? This seems like something we should definitely have some test coverage for.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I remember it crashed it but I forget. I can make a test for it sure.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No crashing now with me playing with the viewport and added a unit test for it also, so I removed the removeOSLShaders();

m_sceneParams.shadingsystem = ccl::SHADINGSYSTEM_OSL;
}

// If anything changes in scene or session, we reset.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the description of this work, I assumed we weren't doing any resets, so I'm confused by this comment. I don't think we've ever clearly written down what constraints we're dealing with (from the Cycles and Renderer APIs), what we're trying to achieve, and what our strategy is. Can you do that?


init();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're not doing resets any more, why does init() still make a new Session, with the trick to reuse the memory address of the old one?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It only does it before rendering starts and not during rendering. There's still roadblocks to consider before moving the init() to render().

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the tricky to reuse the memory address is redundant now, and can be removed?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there's a way to avoid these tags I think so:
https://github.com/GafferHQ/gaffer/blob/main/src/GafferCycles/IECoreCyclesPreview/Renderer.cpp#L2009

A forward declare of CyclesRender and giving it a session or scene getter didn't work, not too sure on what other solutions to have here - is there another way to find out if an object got a transform or attribute change from the Renderer?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I see that nodes have a is_modified() now which can defer to the CyclesRenderer, that's good.

@boberfly boberfly force-pushed the fixes/cyclesParamsBeforeRender branch from b71defb to fcd9dec Compare January 26, 2023 04:48
@boberfly boberfly changed the title GafferCycles : Remove and refactor the problematic reset code GafferCycles : Refactor the problematic reset code Jan 26, 2023
@boberfly boberfly force-pushed the fixes/cyclesParamsBeforeRender branch 3 times, most recently from a255ab2 to 293d760 Compare January 27, 2023 13:09
@boberfly boberfly marked this pull request as ready for review January 27, 2023 13:09
@boberfly boberfly force-pushed the fixes/cyclesParamsBeforeRender branch from 293d760 to 8307715 Compare April 1, 2023 00:00
@boberfly boberfly changed the base branch from main to 1.2_maintenance April 1, 2023 00:05
@boberfly boberfly force-pushed the fixes/cyclesParamsBeforeRender branch from 8307715 to a984209 Compare April 1, 2023 00:52
@boberfly
Copy link
Collaborator Author

Hey John I've made a few changes since we last looked at it, there's 2 unit tests which prod at what this MR ultimately solves (those pesky params on Scene/Session classes before rendering).

From the description of this work, I assumed we weren't doing any resets, so I'm confused by this comment. I don't think we've ever clearly written down what constraints we're dealing with (from the Cycles and Renderer APIs), what we're trying to achieve, and what our strategy is. Can you do that?

A refresher as it's been awhile since I looked at this myself and I'll try to condense/simplify:

Some scene/session parameters cannot be changed once rendering starts is the constraint, but scene/session needs to be created before those options get set via the Gaffer render interface. This code refactors the previous assumption that we could change these post-live rendering by re-creating session/scene and transplanting stuff back in aka the reset codepath. In doing so we avoid many problems, with one being that the mutex for the scene doesn't get deleted while it is locked which causes deadlocks on windows, as well as things like #5234 due to stale pointers to shaders/lights that existed in the previous scene but would be re-created and have different pointers post-reset.

Some of the reset code needs to stay but only so that we can set scene/session parameters before rendering starts and before any scene-related objects can be placed into cycles itself, avoiding any dodgy transplanting or stale pointer fixing. Removing the reset code entirely and making scene/session be created in render() I am not sure if we can do this cleanly, as shaders/objects get created before this call and those need the session pointer to access things like the shader manager as one example. The other alternative I can think of is to modify the Gaffer renderer interface to have some kind of "re-request everything" mode that we can call from render() or we also defer shader/object creation all to render() somehow.

I think this would be a good one to land in soon as it solves #5234 and deadlocks on windows (can't verify that without some help). To get this over the line I think I'll need some Windows assistance from @ericmehl or whoever wants to see if those deadlocks still exist with this patch.

Cheers

@ericmehl
Copy link
Collaborator

Hey @boberfly,
Sorry for letting this fall through the cracks. I tested it on Windows here and it seems to be working great. I was getting the deadlock about 1 in 2 or 3 sessions before, and I just did 10 sessions in a row and didn't get any deadlocks.

🪟 🪟 = 👍 !

- Refactor the problematic reset code as we now only allow scene/session parameters to be set before rendering starts.
- This will also avoid a mutex lock issue with Windows.
- Remove render states and just have an m_rendering bool
- Removed `removeOSLShaders()`, Cycles seems more hardy at not crashing when this happens now
@boberfly boberfly force-pushed the fixes/cyclesParamsBeforeRender branch from a984209 to 1b2f51d Compare July 1, 2023 08:11
@boberfly
Copy link
Collaborator Author

boberfly commented Jul 1, 2023

While working on some production assets, I've been putting Cycles through its paces and passed it through a debugger to discover that the reset code also needs to change ownership of all the objects to the newly created scene to prevent assertions when freeing them later.

I've sometimes got it to crash in Embree+TBB with an invalid pointer, but that one is hard to narrow down, as well as a crash in OIIO. I'll keep at it...

@boberfly boberfly mentioned this pull request Jul 2, 2023
4 tasks
@johnhaddon johnhaddon self-assigned this Oct 20, 2023
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Feb 26, 2024
As the new test (courtesy of Alex via GafferHQ#5101) shows, we don't need to delete OSL shaders when in SVM mode to avoid Cycles crashes (but perhaps this was not always the case?). This avoids a linear-in-the-number-of-shaders check on every render, and simplifies some logic. The outcome is now no longer dependent on whether the SVM/OSL mismatch occurs before or after rendering has started : in both cases we now just give Cycles what the user asked for, and let it ignore shaders it can't handle.

We could really do with a warning when there are OSL shaders in an SVM render, but that will be easier to achieve without expensive checks after a bit more refactoring. Look out for it in an upcoming commit.
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Feb 26, 2024
This was an attempt to deal with a fundamental mismatch between the Cycles and Renderer APIs : you can't create a `ccl::session` without already having `ccl::SessionParams` (threads, shading system, device etc), but the values for those will arrive via `Renderer::option()` _after_ construction of the renderer. And unfortunately the Renderer constructor is the most natural place  to construct the `ccl::Session`.

Our strategy was to create a default session in the renderer constructor, and then store options into `m_sessionParams`, and then in `reset()` attempt to replace the original session with a new one constructed from `m_sessionParams`. This was made tricky by the fact that the session also owns the scene, and we needed to transplant all the old scene objects into the new scene. This almost worked, but had several flaws :

- On Windows, we were getting deadlocks due to deleting the scene lock while holding it. This could have been fixed separately while keeping the reset concept alive - see GafferHQ#5101.
- The Cycles objects we transplanted had pointers to the old scene, and this would trigger assertions when freeing them in debug builds. I suspect they were also the cause of the crash exposed by `InteractiveCyclesRenderTest.testShadingModes` - with the reset code removed, the test now passes.
- More generally, it was working against the Cycles API, and could be broken in any number of ways by future Cycles changes.

Just removing the `reset()` code isn't the end of the story though : it means that the options for devices, threads etc are now completely non-functioning.  This results in test failures for `RendererTest.testCustomAOV` and `RendererTest.testOSLInSVMShadingSystem`. We'll deal with that in future commits : the purpose of this commit is to document why the old approach wasn't working, and clear the decks a little in preparation for a new approach.
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Feb 26, 2024
As the new test (courtesy of Alex via GafferHQ#5101) shows, we don't need to delete OSL shaders when in SVM mode to avoid Cycles crashes (but perhaps this was not always the case?). This avoids a linear-in-the-number-of-shaders check on every render, and simplifies some logic. The outcome is now no longer dependent on whether the SVM/OSL mismatch occurs before or after rendering has started : in both cases we now just give Cycles what the user asked for, and let it ignore shaders it can't handle.

We could really do with a warning when there are OSL shaders in an SVM render, but that will be easier to achieve without expensive checks after a bit more refactoring. Look out for it in an upcoming commit.
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Feb 26, 2024
As the new test (courtesy of Alex via GafferHQ#5101) shows, we don't need to delete OSL shaders when in SVM mode to avoid Cycles crashes (but perhaps this was not always the case?). This avoids a linear-in-the-number-of-shaders check on every render, and simplifies some logic. The outcome is now no longer dependent on whether the SVM/OSL mismatch occurs before or after rendering has started : in both cases we now just give Cycles what the user asked for, and let it ignore shaders it can't handle.

We could really do with a warning when there are OSL shaders in an SVM render, but that will be easier to achieve without expensive checks after a bit more refactoring. Look out for it in an upcoming commit.
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Feb 28, 2024
This was an attempt to deal with a fundamental mismatch between the Cycles and Renderer APIs : you can't create a `ccl::session` without already having `ccl::SessionParams` (threads, shading system, device etc), but the values for those will arrive via `Renderer::option()` _after_ construction of the renderer. And unfortunately the Renderer constructor is the most natural place  to construct the `ccl::Session`.

Our strategy was to create a default session in the renderer constructor, and then store options into `m_sessionParams`, and then in `reset()` attempt to replace the original session with a new one constructed from `m_sessionParams`. This was made tricky by the fact that the session also owns the scene, and we needed to transplant all the old scene objects into the new scene. This almost worked, but had several flaws :

- On Windows, we were getting deadlocks due to deleting the scene lock while holding it. This could have been fixed separately while keeping the reset concept alive - see GafferHQ#5101.
- The Cycles objects we transplanted had pointers to the old scene, and this would trigger assertions when freeing them in debug builds. I suspect they were also the cause of the crash exposed by `InteractiveCyclesRenderTest.testShadingModes` - with the reset code removed, the test now passes.
- More generally, it was working against the Cycles API, and could be broken in any number of ways by future Cycles changes.

Just removing the `reset()` code isn't the end of the story though : it means that the options for devices, threads etc are now completely non-functioning.  This results in test failures for `RendererTest.testCustomAOV` and `RendererTest.testOSLInSVMShadingSystem`. We'll deal with that in future commits : the purpose of this commit is to document why the old approach wasn't working, and clear the decks a little in preparation for a new approach.
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Feb 29, 2024
This was an attempt to deal with a fundamental mismatch between the Cycles and Renderer APIs : you can't create a `ccl::session` without already having `ccl::SessionParams` (threads, shading system, device etc), but the values for those will arrive via `Renderer::option()` _after_ construction of the renderer. And unfortunately the Renderer constructor is the most natural place  to construct the `ccl::Session`.

Our strategy was to create a default session in the renderer constructor, and then store options into `m_sessionParams`, and then in `reset()` attempt to replace the original session with a new one constructed from `m_sessionParams`. This was made tricky by the fact that the session also owns the scene, and we needed to transplant all the old scene objects into the new scene. This almost worked, but had several flaws :

- On Windows, we were getting deadlocks due to deleting the scene lock while holding it. This could have been fixed separately while keeping the reset concept alive - see GafferHQ#5101.
- The Cycles objects we transplanted had pointers to the old scene, and this would trigger assertions when freeing them in debug builds. I suspect they were also the cause of the crash exposed by `InteractiveCyclesRenderTest.testShadingModes` - with the reset code removed, the test now passes.
- More generally, it was working against the Cycles API, and could be broken in any number of ways by future Cycles changes.

Just removing the `reset()` code isn't the end of the story though : it means that the options for devices, threads etc are now completely non-functioning.  This results in test failures for `RendererTest.testCustomAOV` and `RendererTest.testOSLInSVMShadingSystem`. We'll deal with that in future commits : the purpose of this commit is to document why the old approach wasn't working, and clear the decks a little in preparation for a new approach.
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Mar 1, 2024
This was an attempt to deal with a fundamental mismatch between the Cycles and Renderer APIs : you can't create a `ccl::session` without already having `ccl::SessionParams` (threads, shading system, device etc), but the values for those will arrive via `Renderer::option()` _after_ construction of the renderer. And unfortunately the Renderer constructor is the most natural place  to construct the `ccl::Session`.

Our strategy was to create a default session in the renderer constructor, and then store options into `m_sessionParams`, and then in `reset()` attempt to replace the original session with a new one constructed from `m_sessionParams`. This was made tricky by the fact that the session also owns the scene, and we needed to transplant all the old scene objects into the new scene. This almost worked, but had several flaws :

- On Windows, we were getting deadlocks due to deleting the scene lock while holding it. This could have been fixed separately while keeping the reset concept alive - see GafferHQ#5101.
- The Cycles objects we transplanted had pointers to the old scene, and this would trigger assertions when freeing them in debug builds. I suspect this was also the cause of the crash exposed by `InteractiveCyclesRenderTest.testSVMRenderWithCPU`.
- More generally, it was working against the Cycles API, and could be broken in any number of ways by future Cycles changes.

Just removing the `reset()` code isn't the end of the story though : it means that the options for devices, threads etc are now completely non-functioning.  This results in test failures for `RendererTest.testCustomAOV` and `RendererTest.testOSLInSVMShadingSystem`. We'll deal with that in future commits : the purpose of this commit is to document why the old approach wasn't working, and clear the decks a little in preparation for a new approach.
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Mar 4, 2024
This was an attempt to deal with a fundamental mismatch between the Cycles and Renderer APIs : you can't create a `ccl::session` without already having `ccl::SessionParams` (threads, shading system, device etc), but the values for those will arrive via `Renderer::option()` _after_ construction of the renderer. And unfortunately the Renderer constructor is the most natural place  to construct the `ccl::Session`.

Our strategy was to create a default session in the renderer constructor, and then store options into `m_sessionParams`, and then in `reset()` attempt to replace the original session with a new one constructed from `m_sessionParams`. This was made tricky by the fact that the session also owns the scene, and we needed to transplant all the old scene objects into the new scene. This almost worked, but had several flaws :

- On Windows, we were getting deadlocks due to deleting the scene lock while holding it. This could have been fixed separately while keeping the reset concept alive - see GafferHQ#5101.
- The Cycles objects we transplanted had pointers to the old scene, and this would trigger assertions when freeing them in debug builds. I suspect this was also the cause of the crash exposed by `InteractiveCyclesRenderTest.testSVMRenderWithCPU`.
- More generally, it was working against the Cycles API, and could be broken in any number of ways by future Cycles changes.

Just removing the `reset()` code isn't the end of the story though : it means that the options for devices, threads etc are now completely non-functioning.  This results in test failures for `RendererTest.testCustomAOV` and `RendererTest.testOSLInSVMShadingSystem`. We'll deal with that in future commits : the purpose of this commit is to document why the old approach wasn't working, and clear the decks a little in preparation for a new approach.
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Mar 7, 2024
This was an attempt to deal with a fundamental mismatch between the Cycles and Renderer APIs : you can't create a `ccl::session` without already having `ccl::SessionParams` (threads, shading system, device etc), but the values for those will arrive via `Renderer::option()` _after_ construction of the renderer. And unfortunately the Renderer constructor is the most natural place  to construct the `ccl::Session`.

Our strategy was to create a default session in the renderer constructor, and then store options into `m_sessionParams`, and then in `reset()` attempt to replace the original session with a new one constructed from `m_sessionParams`. This was made tricky by the fact that the session also owns the scene, and we needed to transplant all the old scene objects into the new scene. This almost worked, but had several flaws :

- On Windows, we were getting deadlocks due to deleting the scene lock while holding it. This could have been fixed separately while keeping the reset concept alive - see GafferHQ#5101.
- The Cycles objects we transplanted had pointers to the old scene, and this would trigger assertions when freeing them in debug builds. I suspect this was also the cause of the crash exposed by `InteractiveCyclesRenderTest.testSVMRenderWithCPU`.
- More generally, it was working against the Cycles API, and could be broken in any number of ways by future Cycles changes.

Just removing the `reset()` code isn't the end of the story though : it means that the options for devices, threads etc are now completely non-functioning.  This results in test failures for `RendererTest.testCustomAOV` and `RendererTest.testOSLInSVMShadingSystem`. We'll deal with that in future commits : the purpose of this commit is to document why the old approach wasn't working, and clear the decks a little in preparation for a new approach.
johnhaddon added a commit to johnhaddon/gaffer that referenced this pull request Mar 7, 2024
This was an attempt to deal with a fundamental mismatch between the Cycles and Renderer APIs : you can't create a `ccl::session` without already having `ccl::SessionParams` (threads, shading system, device etc), but the values for those will arrive via `Renderer::option()` _after_ construction of the renderer. And unfortunately the Renderer constructor is the most natural place  to construct the `ccl::Session`.

Our strategy was to create a default session in the renderer constructor, and then store options into `m_sessionParams`, and then in `reset()` attempt to replace the original session with a new one constructed from `m_sessionParams`. This was made tricky by the fact that the session also owns the scene, and we needed to transplant all the old scene objects into the new scene. This almost worked, but had several flaws :

- On Windows, we were getting deadlocks due to deleting the scene lock while holding it. This could have been fixed separately while keeping the reset concept alive - see GafferHQ#5101.
- The Cycles objects we transplanted had pointers to the old scene, and this would trigger assertions when freeing them in debug builds. I suspect this was also the cause of the crash exposed by `InteractiveCyclesRenderTest.testSVMRenderWithCPU`.
- More generally, it was working against the Cycles API, and could be broken in any number of ways by future Cycles changes.

Just removing the `reset()` code isn't the end of the story though : it means that the options for devices, threads etc are now completely non-functioning.  This results in test failures for `RendererTest.testCustomAOV` and `RendererTest.testOSLInSVMShadingSystem`. We'll deal with that in future commits : the purpose of this commit is to document why the old approach wasn't working, and clear the decks a little in preparation for a new approach.
@johnhaddon
Copy link
Member

Closing, since this should now be covered by the merge of #5711.

@johnhaddon johnhaddon closed this Mar 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

3 participants