Faster unique rows #297

Closed · wants to merge 4 commits

Conversation

@Huite (Collaborator) commented Sep 4, 2024

Fixes #46

@Huite (Collaborator, Author) commented Sep 4, 2024

@veenstrajelmer

  • Replaced all np.unique(..., axis=0) calls with a faster custom implementation.
  • Rationale: np.unique relies on sorting; our implementation minimizes the number of sorting operations. (An illustrative sketch of the general technique follows this list.)
  • The order of generated connectivities (e.g., edge_node_connectivity) will differ from previous versions.
  • When comparing datasets generated before and after this change, use reindex_like to ensure consistent ordering (usage sketch below).
  • Xugrid will still respect existing edge_node_connectivities; this PR doesn't change that behavior.
  • Expected speed improvements include, among other things, merging the topologies of the different partitions; that would be interesting to test.
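
As an illustration of the general idea (not necessarily this PR's exact code), here is a minimal sketch of a common trick for faster unique rows: view each row of a contiguous integer array as a single void scalar, so np.unique performs one sort on a 1D array instead of the column-wise lexicographic sort behind np.unique(..., axis=0). The function name unique_rows is hypothetical.

```python
import numpy as np

def unique_rows(a: np.ndarray) -> np.ndarray:
    # Illustrative sketch only; not necessarily the implementation in this PR.
    # Works for contiguous integer arrays such as connectivity tables
    # (float arrays would need care with NaN and -0.0 byte patterns).
    a = np.ascontiguousarray(a)
    # Reinterpret each row as one opaque scalar spanning the row's bytes.
    void_dtype = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    flat = a.view(void_dtype).ravel()
    # A single 1D sort inside np.unique; return_index recovers the rows.
    _, index = np.unique(flat, return_index=True)
    return a[np.sort(index)]

# Example: duplicate edges collapse to unique rows, first occurrence kept.
edges = np.array([[0, 1], [2, 3], [0, 1], [1, 2]])
print(unique_rows(edges))  # [[0 1] [2 3] [1 2]]
```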
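
And a hedged usage sketch for the reindex_like comparison workflow; the file names are hypothetical, and this assumes the ugrid accessor exposes reindex_like as referenced above:

```python
import xugrid as xu

# Hypothetical files: "before.nc" written with an older xugrid release,
# "after.nc" with this branch, so connectivity ordering may differ.
old = xu.open_dataset("before.nc")
new = xu.open_dataset("after.nc")

# Reorder `new` to match `old` before comparing values element-wise.
# Assumes the ugrid accessor provides reindex_like, as referenced above.
aligned = new.ugrid.reindex_like(old)
```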

@veenstrajelmer (Collaborator) left a comment

I have tested this a few times with several datasets. Keep in mind that caching is relevant for the second DCSM model, so those timings are not from the first run; caching is somehow maintained across multiple processes.

With xugrid 0.12.0:

DCSM nose
>> xu.open_dataset() with 5 partition(s): 1 2 3 4 5 : 18.80 sec
>> xu.merge_partitions() with 5 partition(s): 15.91 sec
>> dfmt.open_partitioned_dataset() total: 34.71 sec
DCSM
>> xu.open_dataset() with 20 partition(s): 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 : 5.70 sec
>> xu.merge_partitions() with 20 partition(s): 12.30 sec
>> dfmt.open_partitioned_dataset() total: 18.08 sec
Westerschelde
>> xu.open_dataset() with 1 partition(s): 1 : 0.09 sec

Memory usage: [memory usage plot omitted]

With the faster-unique-rows branch:

DCSM nose
>> xu.open_dataset() with 5 partition(s): 1 2 3 4 5 : 18.61 sec
>> xu.merge_partitions() with 5 partition(s): 15.65 sec
>> dfmt.open_partitioned_dataset() total: 34.27 sec
DCSM
>> xu.open_dataset() with 20 partition(s): 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 : 5.78 sec
>> xu.merge_partitions() with 20 partition(s): 12.35 sec
>> dfmt.open_partitioned_dataset() total: 18.21 sec
Westerschelde
>> xu.open_dataset() with 1 partition(s): 1 : 0.09 sec

Memory usage: [memory usage plot omitted]

Unfortunately, in all of these test cases there is no noticeable difference.

@Huite (Collaborator, Author) commented Sep 5, 2024

Based on these results, the sorting isn't taking enough time to make this change worthwhile.
(I could've known this beforehand if I had profiled properly, but it's nice to have realistic tests anyway!)
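
A simple microbenchmark along these lines could confirm that conclusion; the array size and value range below are arbitrary assumptions, not the DCSM datasets:

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
# Arbitrary stand-in for an edge_node_connectivity array.
edges = rng.integers(0, 100_000, size=(1_000_000, 2))

def unique_axis0():
    return np.unique(edges, axis=0)

def unique_void_view():
    # Same void-view trick as sketched above.
    flat = np.ascontiguousarray(edges).view(
        np.dtype((np.void, edges.dtype.itemsize * edges.shape[1]))
    ).ravel()
    _, index = np.unique(flat, return_index=True)
    return edges[np.sort(index)]

print("np.unique(axis=0):", timeit.timeit(unique_axis0, number=3), "s")
print("void-view unique :", timeit.timeit(unique_void_view, number=3), "s")
```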

@Huite Huite closed this Sep 5, 2024