-
Notifications
You must be signed in to change notification settings - Fork 231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make Eth2Digest isZero 8x faster #6682
Merged
Changes from all commits
Commits
Show all changes
3 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since it's a weekend, there's one thing that one might try, which is to get rid of the branch entirely:
it could pay off because mdigest is 32 bytes which fits in a cache line so it won't cost anything more to read the rest of it and the resulting code should be much smaller but who knows these days..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
on the other hand, if it's nonzero, it's likely to be so in the first word too - choices, choices... it's certainly faster when it's zero, so it might be that it depends on the frequency of zero-mem digests, which one ends up a "total winner" :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, it's comparable, maybe slightly faster. It's easy enough to try both, but the advantage of the one with branches is that it can early-exit, so these benchmarks are best case:
Would have to add non-zero benchmarks to compare properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried all 4 combinations,
staticFor
+copyMem
); andstaticFor
/copyMem
/or
loop with no early exitand zero/non-zero hashes (set them to all 255)
And:
benchmarks_nonzerohash_copymem.txt
benchmarks_nonzerohash_copymem_or.txt
benchmarks_zerohash_copymem.txt
benchmarks_zerohash_copymem_or.txt
Remarkably similar to each other. The main advantage seems to be the efficient static iteration over fewer, large elements, not the exact operations or even the early exit. It matters, but it's fairly marginal performance-wise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
or
approach isn't much better, but sure, slightly better, so going with it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One quirk of this version is that it generalizes less outside the
Eth2Digest
and/or hash function case, where there are only 4 chunks to process, so the early exit doesn't buy much.For more general
isZeroMemory
, early exit, unless it had to be constant-time, might become more generally useful.