Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Instancer : Don't output separate capsules for prototypes #6052

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

danieldresser-ie
Copy link
Contributor

@danieldresser-ie danieldresser-ie commented Sep 20, 2024

As discussed, output one big Capsule from Instancer when encapsulateInstanceGroups is checked.

All seems to be working ... the Instancer code is pretty complex, but the test coverage is pretty decent.

The main goal is to avoid slowdown in Arnold because Arnold doesn't like BVH's for overlapping procedurals, but this does also result in a 20% reduction in the time spent in Gaffer while rendering an encapsulated Instancer. testEncapsulatedRenderPerf only shows an 8% reduction, but that's because the time spent in CapturingRender dominates that test ( in a 1.06 second test, about 0.7 seconds is in CapturingRender ). I was trying hard not to spend any time on that, so I just did a quick test to make sure I understood what's happening by ripping out all it's functionality for a temporary test. I haven't looked into why it's slow, but I suspect tbb::concurrent_hash_map might be the biggest problem.

Outputting names to the renderer including the prototype name prefix turned out not to be a performance problem at all. I'm currently keeping the property of the old code where it doesn't do any string allocations for each instance if the prototype name is the same ... I couldn't actually measure any slowdown if I did some slightly simpler code that allocated new strings for each instance, but this code is only 12 lines, so it didn't seem like a problem to keep it.

Some of the performance results look a bit unreasonable after this PR: testEngineDataPerf and testEngineDataPerfWithPrototypes show 99%/100% reductions. But it is actually true that in these cases, computing the EngineData doesn't require looping over each instance. Since the tests only evaluate childNames( ".../instances" ), now that the data necessary for splitting prototypes is stored on a separate plug, we can compute the names of the prototypes without having to process the instances, so there is actually a 99% improvement in this case. Maybe I should add more perf tests to illustrate this more clearly? But there are already a good number of performance tests that show different aspects of this, like testEngineDataPerfWithPrototypesAndIds, which is able to skip the prototype splitting calculation, but needs to process all the instances for a different reason to make sure the ids are valid, and only shows a 16% improvement, or testChildNamesPerf, which looks at the child names of one of the prototypes, so does require the prototype splitting data, and doesn't improve.

Copy link
Member

@johnhaddon johnhaddon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick turnaround on this Daniel! LGTM, although I did note a few minor things inline (and we definitely should rename the plug).

Now we've gone this far, I'm really liking the idea of simplifying the hierarchy for the non-encapsulated case too. I've opened negotiations with folks at Cinesite about that, and I think there's a decent chance they'll go for it. We should do the same with Image Engine, and of course anyone reading along is welcome to share their opinion too.

I couldn't actually measure any slowdown if I did some slightly simpler code that allocated new strings for each instance, but this code is only 12 lines, so it didn't seem like a problem to keep it.

I think I feel a bit differently about this - I think all complexity needs to justify itself in some way. Otherwise it's just sucking attention away from things that actually do matter, and providing one more place for bugs to hide. I'm hopeful that we'll be able to go all the way in terms of simplifying the hierarchy though, in which case a zero-allocation approach to making the purely numeric names we need will be nice and easy. So let's leave this as it is while we decide on that.

Cheers...
John

Changes.md Outdated
@@ -123,6 +123,7 @@ Breaking Changes
- ImageGadget : Remove non-const variant of `getContext()`.
- LazyMethod : `deferUntilPlaybackStops` now requires that the Widget has a `scriptNode()` method rather than a `context()` method.
- Python : Gaffer now disables the user site-packages directory by setting `PYTHONNOUSERSITE=1`. To revert to the previous behaviour, set `PYTHONNOUSERSITE=0` before launching Gaffer.
- Instancer : `encapsulateInstanceGroups` now produces a single Capsule at the `.../instances` location, instead of producing separate capsules at each `.../instances/<prototypeName>` location. Outputting a single capsule is slightly less flexible for users, but is much faster to render in Arnold ( Arnold is currently unable to deal effectively with large numbers of fully overlapping procedurals ). This change also allowed us to simplify the code slightly, reducing the time spent in Gaffer when rendering an encapsulated Instancer by around 20%.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should rename encapsulateInstanceGroups to just encapsulate now, since we're no longer encapsulating instance groups. Let's do that as a separate commit, with a compatibility config to keep old scenes loading.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will also need moving up to a new section, since we've released 1.5.0.0a1 now.

@@ -395,7 +395,7 @@ class Instancer::EngineData : public Data
}
}

bool hasDuplicates = false;
m_hasDuplicates = false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice to move this to the initialiser section, so it's always initialised? I suspect we don't ever use it when we returned early from the constructor, but it's one less thing to think about.

@@ -2331,9 +2402,9 @@ IECore::ConstInternedStringVectorDataPtr Instancer::computeBranchChildNames( con
{
// "/instances/<prototypeName>"

ConstEngineDataPtr engineData = engine( sourcePath, context );
ConstEngineSplitPrototypesDataPtr esp = engineSplitPrototypes( sourcePath, context );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't match hashBranchChildNames() any more, because that is only hashing enginePlug(). In practice I don't think that causes any problems, because engineSplitPrototypes is derived entirely from engine. But with possible future changes in mind, perhaps it's worth being a bit of a stickler here?


for( size_t idx = r.begin(); idx != r.end(); ++idx )
for( size_t i = r.begin(); i != r.end(); ++i )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe i could be called pointIndex, replacing the original variable? Then less code would change, and I think it'd be a bit easier to follow anyway...

Comment on lines 2903 to 3030
for( unsigned int i = 0; i < engines.size(); i++ )
for( unsigned int j = 0; j < engines.size(); j++ )
{
int curPointIndex = i == 0 ? pointIndex : engines[i]->pointIndex( instanceId );
pointTransforms[i] = proto->m_transforms[i] * engines[i]->instanceTransform( curPointIndex );
int curPointIndex = j == 0 ? i : engines[j]->pointIndex( instanceId );
pointTransforms[j] = proto->m_transforms[j] * engines[j]->instanceTransform( curPointIndex );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we called the previous variable pointIndex rather than i, could this all stay unchanged?

@danieldresser-ie danieldresser-ie force-pushed the instancerOneCap branch 2 times, most recently from 6db01f6 to ec82fd8 Compare September 23, 2024 22:48
@danieldresser-ie
Copy link
Contributor Author

Added a fix commit addressing feedback.

Question: should I add a compatibility config mapping from the old encapsulateInstanceGroups to the new encapsulate? They are kinda different, but it seems closer to preserving intent to transfer the old setting?

Other than that, the question is whether to completely remove the prototype groups - we'll continue that conversation offline.

@johnhaddon
Copy link
Member

should I add a compatibility config mapping from the old encapsulateInstanceGroups to the new encapsulate? They are kinda different, but it seems closer to preserving intent to transfer the old setting?

Yep, I was expecting that we'd transfer the old setting. To me the intent is primarily "optimise this", and that hasn't changed.

@danieldresser-ie
Copy link
Contributor Author

Added compatibility shim to fix commit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants