-
Notifications
You must be signed in to change notification settings - Fork 207
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
MergeObjects #5984
MergeObjects #5984
Conversation
ecc77e0
to
a824933
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Daniel!
It's going to take me a long time to absorb all this - as you said, it's not simple! Just posting a few preliminary comments so you have a couple of things to consider today. Hopefully I'll be able to get my head into it a bit further on Monday. In the meantime it probably makes sense to work on the merge algo...
Cheers...
John
// To know if we need to check the child bounds, we need to know if there are any destinations below this | ||
// location. We can efficiently query this from the mergeLocationData. | ||
ConstMergeLocationDataPtr mergeLocationData = IECore::runTimeCast< const MergeLocationData >( mergeLocationPlug()->getValue() ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand it, we're evaluating mergeLocationsPlug()
at every location (not just here, elsewhere in the code too), and it has a unique hash at each location. I believe this differs from the BranchCreator strategy where we only evaluate mappingPlug()
after using branchesPlug()
to determine that there's something worth evaluating there. I wonder if this has performance implications...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure it does have some sort of performance implications. In a simplistic view of read performance, grabbing mergeLocations plug from the cache is one map lookup ( of an admittedly large map ), whereas finding the location in branchesPlug() requires a map lookup for each level of the hierarchy, and then you'd have to do another plug evaluation of the filter to check if anything is pruned ( which is also unique at each location ). But there probably are cases where the hash pressure is a much bigger problem than doing some map lookups. Probably could be reasonable to add some more code paths to check the tree data and the filter, and skip access to mergeLocationPlug when not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a simplistic view of read performance, grabbing mergeLocations plug from the cache is one map lookup
Not sure I've read this right, but it seems like you might be implying that most __mergeLocation
evaluations are cache hits? But it looks like MergeLocationsData::hash()
includes path
in the hash, meaning that we're computing something at __mergeLocation
every time, even when there's no merging going on at that location. That's what led me to think this might be a fair bit worse than doing an initial lookup on the TreeData (which is cached).
I suggest we park this temporarily while focussing on getting a final MeshAlgo::merge()
implementation sorted. But I do think it'd be worth doing a bit of further investigation before we release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably should have documented my thoughts more before moving on, took me a bit to remember this, but anyway:
Yes, we hit MergeLocations for every location, including locations where no merging is going on. But the work performed in locations where there is no merging happening is the same work we would need to perform anyway in order to check if we should access MergeLocations.
So the trade-off is just whether we cache this work between different locations accessing MergeLocations - caching it is a minor speedup, but consumes more cache.
I've added two new tests in 5d5ccc3
testPerformanceUnprocessedLocations
tests the case where we aren't processing most locations, and MergeLocationData is usually NullObject anyway, which gives us the best case of seeing a benefit from not bothering to access it.
testPerformanceDeepPaths
tests the case where there are deep paths being accessed repeatedly - the children of a location will access MergeLocationData in order to check their parent, so there are lots of accesses to every location, and querying the tree is expensive, giving maximum benefit to caching MergeLocationData.
19f5fdb adds some extra code to perform the extra tree tree query and skip accessing MergeLocationsData whenever possible.
This causes testPerformanceDeepPaths to get 10% slower, and testPerformanceUnprocessedLocations to get 5% faster.
There is probably an argument though that testPerformanceUnprocessedLocations is more plausible to occur in production scenes, and while the performance benefit isn't huge, it does reduce cache usage by 50% in this extreme case - so maybe it's worth it?
If you want to go with 5d5ccc3, I should clean it up tomorrow. If you don't think it's needed, it would be fine to just drop the last commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't anticipated the need for the filter checks in accessMergeLocation()
- with my naive BranchCreator-primed hat on, I think I assumed the tree had all the info we need. But it doesn't, because that only stores destinations, right, and things that need pruning aren't stored?
For what it's worth, I'm seeing a 10% improvement in testPerformanceUnprocessedLocations
and no real change in testPerformanceDeepPaths
here on my workstation (32 core Xeon). My general feeling is that the merging-a-small-number-of-locations-in-a-large-scene is most likely our most important use case, but we also have to balance that against the maintainability of the code. I'll leave it to your judgement...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative thought : is it possible to tighten up the hash for mergeLocationPlug()
, so that we do fewer computes on that plug by omitting path
whenever we can?
Just noting that #956 can be closed once this is merged. |
a824933
to
7eb9c62
Compare
Addressed first round feedback. Was unsure whether to squash or not, but it all seemed straightforward enough that it made sense to squash. |
7eb9c62
to
2a60b22
Compare
Thanks for the update Daniel - I've been through and resolved or replied to the previous discussions. One thing I forgot to mention the first time around - I'd like to omit the BranchCreator commits from this PR. They're not really related, and I'm not sure how much benefit they really have - feel free to open a separate PR for them if you want though. |
2a60b22
to
2f091b7
Compare
Spent an hour or so messing about with this this afternoon, and it all worked very nicely. Every time I thought I'd found a bug it turned out just to be a side-effect of the zero-initialisation of values when one mesh doesn't have a primvar that another mesh does have. This happened for me both with |
2f091b7
to
f804524
Compare
f804524
to
39ed3d9
Compare
Thanks Daniel - could you rebase this one more time please, now that #6026 is merged. We also have a parked conversation that is probably worth continuing before finally merging this. |
3cb7281
to
19f5fdb
Compare
While waiting for feedback on the optimization question, I've added MergePoints and MergeCurves. Also added a82ce06 with special case handling of "type" on Curves, since it won't be picked up by the renderer backends if we promote it from Constant to Vertex. |
fe42693
to
d45bcaa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Daniel! I've replied to our previous conversation about performance, noted a TODO from last time, and added a couple of comments for the latest commits. Hopefully we're pretty close now.
Cheers...
John
// To know if we need to check the child bounds, we need to know if there are any destinations below this | ||
// location. We can efficiently query this from the mergeLocationData. | ||
ConstMergeLocationDataPtr mergeLocationData = IECore::runTimeCast< const MergeLocationData >( mergeLocationPlug()->getValue() ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hadn't anticipated the need for the filter checks in accessMergeLocation()
- with my naive BranchCreator-primed hat on, I think I assumed the tree had all the info we need. But it doesn't, because that only stores destinations, right, and things that need pruning aren't stored?
For what it's worth, I'm seeing a 10% improvement in testPerformanceUnprocessedLocations
and no real change in testPerformanceDeepPaths
here on my workstation (32 core Xeon). My general feeling is that the merging-a-small-number-of-locations-in-a-large-scene is most likely our most important use case, but we also have to balance that against the maintainability of the code. I'll leave it to your judgement...
// To know if we need to check the child bounds, we need to know if there are any destinations below this | ||
// location. We can efficiently query this from the mergeLocationData. | ||
ConstMergeLocationDataPtr mergeLocationData = IECore::runTimeCast< const MergeLocationData >( mergeLocationPlug()->getValue() ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative thought : is it possible to tighten up the hash for mergeLocationPlug()
, so that we do fewer computes on that plug by omitting path
whenever we can?
d45bcaa
to
c52646c
Compare
Not sure if there's any reason this wasn't added before, but it seems to work fine from Python
c52646c
to
d2bdc6d
Compare
I've removed the outdated TODO, and gone with your suggestion to spend the upfront evaluation of If this looks good, it just needs a squash before merging. |
d2bdc6d
to
cfe7ec7
Compare
Thanks Daniel! Squashed and merging... |
OK, this should be in a pretty decent place to review ... all the weird corner case behaviours seem to be working OK.
There are some odds-n-ends commits here, from back when I was hoping this could share more functionality with BranchCreator and I was looking at BranchCreator closely.
Oh, and a TODO on the weird thing on SceneNode where it uses
h
as the initial value, but then also appends onto h afterwards, I should clean that up.Anyway, we can't merge this yet because we don't have a tolerable implementation of merge merging yet - the only thing I want you to look at is MergeObjects, and I think that's ready.