Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MergeObjects #5984

Merged
merged 10 commits into from
Sep 17, 2024
Merged

MergeObjects #5984

merged 10 commits into from
Sep 17, 2024

Conversation

danieldresser-ie
Copy link
Contributor

OK, this should be in a pretty decent place to review ... all the weird corner case behaviours seem to be working OK.

There are some odds-n-ends commits here, from back when I was hoping this could share more functionality with BranchCreator and I was looking at BranchCreator closely.

Oh, and a TODO on the weird thing on SceneNode where it uses h as the initial value, but then also appends onto h afterwards, I should clean that up.

Anyway, we can't merge this yet because we don't have a tolerable implementation of merge merging yet - the only thing I want you to look at is MergeObjects, and I think that's ready.

Copy link
Member

@johnhaddon johnhaddon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Daniel!

It's going to take me a long time to absorb all this - as you said, it's not simple! Just posting a few preliminary comments so you have a couple of things to consider today. Hopefully I'll be able to get my head into it a bit further on Monday. In the meantime it probably makes sense to work on the merge algo...

Cheers...
John

include/GafferScene/Private/ChildNamesMap.h Outdated Show resolved Hide resolved
src/GafferScene/Constraint.cpp Show resolved Hide resolved
include/GafferScene/MergeObjects.h Outdated Show resolved Hide resolved
include/GafferScene/MergeObjects.h Outdated Show resolved Hide resolved
include/GafferScene/MergeObjects.h Outdated Show resolved Hide resolved
Comment on lines +956 to +1088
// To know if we need to check the child bounds, we need to know if there are any destinations below this
// location. We can efficiently query this from the mergeLocationData.
ConstMergeLocationDataPtr mergeLocationData = IECore::runTimeCast< const MergeLocationData >( mergeLocationPlug()->getValue() );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I understand it, we're evaluating mergeLocationsPlug() at every location (not just here, elsewhere in the code too), and it has a unique hash at each location. I believe this differs from the BranchCreator strategy where we only evaluate mappingPlug() after using branchesPlug() to determine that there's something worth evaluating there. I wonder if this has performance implications...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure it does have some sort of performance implications. In a simplistic view of read performance, grabbing mergeLocations plug from the cache is one map lookup ( of an admittedly large map ), whereas finding the location in branchesPlug() requires a map lookup for each level of the hierarchy, and then you'd have to do another plug evaluation of the filter to check if anything is pruned ( which is also unique at each location ). But there probably are cases where the hash pressure is a much bigger problem than doing some map lookups. Probably could be reasonable to add some more code paths to check the tree data and the filter, and skip access to mergeLocationPlug when not needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a simplistic view of read performance, grabbing mergeLocations plug from the cache is one map lookup

Not sure I've read this right, but it seems like you might be implying that most __mergeLocation evaluations are cache hits? But it looks like MergeLocationsData::hash() includes path in the hash, meaning that we're computing something at __mergeLocation every time, even when there's no merging going on at that location. That's what led me to think this might be a fair bit worse than doing an initial lookup on the TreeData (which is cached).

I suggest we park this temporarily while focussing on getting a final MeshAlgo::merge() implementation sorted. But I do think it'd be worth doing a bit of further investigation before we release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I probably should have documented my thoughts more before moving on, took me a bit to remember this, but anyway:

Yes, we hit MergeLocations for every location, including locations where no merging is going on. But the work performed in locations where there is no merging happening is the same work we would need to perform anyway in order to check if we should access MergeLocations.

So the trade-off is just whether we cache this work between different locations accessing MergeLocations - caching it is a minor speedup, but consumes more cache.

I've added two new tests in 5d5ccc3

testPerformanceUnprocessedLocations tests the case where we aren't processing most locations, and MergeLocationData is usually NullObject anyway, which gives us the best case of seeing a benefit from not bothering to access it.

testPerformanceDeepPaths tests the case where there are deep paths being accessed repeatedly - the children of a location will access MergeLocationData in order to check their parent, so there are lots of accesses to every location, and querying the tree is expensive, giving maximum benefit to caching MergeLocationData.

19f5fdb adds some extra code to perform the extra tree tree query and skip accessing MergeLocationsData whenever possible.

This causes testPerformanceDeepPaths to get 10% slower, and testPerformanceUnprocessedLocations to get 5% faster.

There is probably an argument though that testPerformanceUnprocessedLocations is more plausible to occur in production scenes, and while the performance benefit isn't huge, it does reduce cache usage by 50% in this extreme case - so maybe it's worth it?

If you want to go with 5d5ccc3, I should clean it up tomorrow. If you don't think it's needed, it would be fine to just drop the last commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't anticipated the need for the filter checks in accessMergeLocation() - with my naive BranchCreator-primed hat on, I think I assumed the tree had all the info we need. But it doesn't, because that only stores destinations, right, and things that need pruning aren't stored?

For what it's worth, I'm seeing a 10% improvement in testPerformanceUnprocessedLocations and no real change in testPerformanceDeepPaths here on my workstation (32 core Xeon). My general feeling is that the merging-a-small-number-of-locations-in-a-large-scene is most likely our most important use case, but we also have to balance that against the maintainability of the code. I'll leave it to your judgement...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative thought : is it possible to tighten up the hash for mergeLocationPlug(), so that we do fewer computes on that plug by omitting path whenever we can?

src/GafferScene/MergeMeshes.cpp Show resolved Hide resolved
src/GafferScene/MergeMeshes.cpp Outdated Show resolved Hide resolved
src/GafferScene/MergeMeshes.cpp Outdated Show resolved Hide resolved
src/GafferScene/MergeMeshes.cpp Outdated Show resolved Hide resolved
@johnhaddon
Copy link
Member

Just noting that #956 can be closed once this is merged.

@danieldresser-ie danieldresser-ie changed the base branch from main to 1.4_maintenance August 6, 2024 23:26
@danieldresser-ie
Copy link
Contributor Author

Addressed first round feedback. Was unsure whether to squash or not, but it all seemed straightforward enough that it made sense to squash.

@johnhaddon
Copy link
Member

Thanks for the update Daniel - I've been through and resolved or replied to the previous discussions. One thing I forgot to mention the first time around - I'd like to omit the BranchCreator commits from this PR. They're not really related, and I'm not sure how much benefit they really have - feel free to open a separate PR for them if you want though.

@danieldresser-ie
Copy link
Contributor Author

This is in a pretty sloppy state at the moment - it's rebased on top of #6026 which is full of patch commits, and I haven't yet trimmed out the BranchCreator commits.

But this should be good enough for doing interactive testing of #6026 as you requested.

@johnhaddon
Copy link
Member

Spent an hour or so messing about with this this afternoon, and it all worked very nicely. Every time I thought I'd found a bug it turned out just to be a side-effect of the zero-initialisation of values when one mesh doesn't have a primvar that another mesh does have. This happened for me both with Cs and N, in both cases making the mesh black in the viewport. I think this might warrant a note in the documentation for the node, and perhaps in the case of N, a warning at runtime?

@danieldresser-ie
Copy link
Contributor Author

Re-rebased onto #6026
Split off #6038 and #6039
Added note about missing normals to node doc.

@johnhaddon
Copy link
Member

Thanks Daniel - could you rebase this one more time please, now that #6026 is merged. We also have a parked conversation that is probably worth continuing before finally merging this.

@danieldresser-ie
Copy link
Contributor Author

While waiting for feedback on the optimization question, I've added MergePoints and MergeCurves.

Also added a82ce06 with special case handling of "type" on Curves, since it won't be picked up by the renderer backends if we promote it from Constant to Vertex.

Copy link
Member

@johnhaddon johnhaddon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Daniel! I've replied to our previous conversation about performance, noted a TODO from last time, and added a couple of comments for the latest commits. Hopefully we're pretty close now.
Cheers...
John

startup/gui/menus.py Outdated Show resolved Hide resolved
src/GafferScene/MergePoints.cpp Outdated Show resolved Hide resolved
Comment on lines +956 to +1088
// To know if we need to check the child bounds, we need to know if there are any destinations below this
// location. We can efficiently query this from the mergeLocationData.
ConstMergeLocationDataPtr mergeLocationData = IECore::runTimeCast< const MergeLocationData >( mergeLocationPlug()->getValue() );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hadn't anticipated the need for the filter checks in accessMergeLocation() - with my naive BranchCreator-primed hat on, I think I assumed the tree had all the info we need. But it doesn't, because that only stores destinations, right, and things that need pruning aren't stored?

For what it's worth, I'm seeing a 10% improvement in testPerformanceUnprocessedLocations and no real change in testPerformanceDeepPaths here on my workstation (32 core Xeon). My general feeling is that the merging-a-small-number-of-locations-in-a-large-scene is most likely our most important use case, but we also have to balance that against the maintainability of the code. I'll leave it to your judgement...

Comment on lines +956 to +1088
// To know if we need to check the child bounds, we need to know if there are any destinations below this
// location. We can efficiently query this from the mergeLocationData.
ConstMergeLocationDataPtr mergeLocationData = IECore::runTimeCast< const MergeLocationData >( mergeLocationPlug()->getValue() );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternative thought : is it possible to tighten up the hash for mergeLocationPlug(), so that we do fewer computes on that plug by omitting path whenever we can?

Not sure if there's any reason this wasn't added before, but it seems to work fine from Python
@danieldresser-ie
Copy link
Contributor Author

I've removed the outdated TODO, and gone with your suggestion to spend the upfront evaluation of __tree in order to give the more precise hash for __mergeLocation. The hash for objectPlug was evaluating __tree anyway, and going with this approach is simpler, and gives the performance benefit of the previous optimization commit, while avoiding the downside in testPerformanceDeepPaths.

If this looks good, it just needs a squash before merging.

@johnhaddon
Copy link
Member

Thanks Daniel! Squashed and merging...

@johnhaddon johnhaddon merged commit 0ea3e19 into GafferHQ:main Sep 17, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants