-
Notifications
You must be signed in to change notification settings - Fork 589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove redundant instructions from tile hash #3898
Remove redundant instructions from tile hash #3898
Conversation
Signed-off-by: Curtis Black <[email protected]>
@@ -590,6 +594,7 @@ struct TileID { | |||
int m_miplevel; ///< MIP-map level | |||
short m_chbegin, m_chend; ///< Channel range | |||
int m_colortransformid; ///< Colorspace id (0 == default) | |||
int m_padding = 0; ///< Unused |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably be evaluated separately, but maybe instead of adding 32-bits of padding, m_miplevel
and m_colortransformid
could maybe be changed to 16-bit to shave off an extra 8 bytes?
I'd be at least curious to know if that saves any measurable amount of time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I didn't necessarily wish to bundle that type of change into this PR but I also thought something similar.
Unfortunately I don't have a build with all the recent color space additions to test performance. The performance was measured on v2.4.6, prior to the addition of m_colortransformid
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can save this for another pass, but it seems like a good idea to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, great idea.
I will merge when CI finishes.
@curtisblack Please look at the failed CI test, which implicates this code. The one test that fails is when using gcc12, presumably triggering a new warning that didn't exist in older gcc versions. It's behaving as if there is an uninitialized field in the TileID struct, but looking at your code I don't see why that would be the case. Can you please take a close look and see if you can convince yourself whether the warning is correct (and fix) or if it's some kind of false positive (in which case you should probably use some kind of pragma to locally disable it where it is causing trouble, so that we can continue to pass CI). |
Thanks for checking @lgritz, as far as I can tell it does appear to be a false positive that only seems to affect GCC. Clang doesn't see this as a warning or error. I will look into locally disabling this specific warning for that part of the code. See https://godbolt.org/z/5q7Y7ndfb for a comparison with clang. It might also be showing in that CI build because it enables I did some additional testing, using the |
Signed-off-by: Curtis Black <[email protected]>
src/libtexture/imagecache_pvt.h
Outdated
// This appears to be a false positive which only affects GCC. | ||
// https://godbolt.org/z/5q7Y7ndfb | ||
OIIO_PRAGMA_WARNING_PUSH | ||
OIIO_GCC_ONLY_PRAGMA(GCC diagnostic ignored "-Wuninitialized") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still failing. Based on the error message, I think this might need to be -Wmaybe-uninitialized
?
Signed-off-by: Curtis Black <[email protected]>
Signed-off-by: Larry Gritz <[email protected]>
I think I have the right fix, @curtisblack -- the pragma needs to be in hash.h, immediately surrounding the line where it thinks there is a problem (and the other spots were not needed, and I think also were turning off the wrong warning). I've taken the liberty of pushing my fix on top of yours. We'll see, if it passes CI then I'll merge the combined patch. Cross fingers... |
d757bff
into
AcademySoftwareFoundation:master
Relevant tests passed, warning squashed, so I merged! |
…4006) A static_assert was added recently as part of PR #3898, which assures that the size of a TileID is a multiple of 8 bytes. But on architectures with 32 bit pointers, this assertion fails. Which is not quite what we want, because the code isn't wrong if the struct isn't a multiple of 8 bytes; it's just less efficient. And we don't want to fail for 32 bit arch. Bottom line: only do this check to ensure we're in the most efficient case if we're 64 bits. Nobody's expecting highest performance on a 32 bit machine anyway. Fixes #4001 Signed-off-by: Larry Gritz <[email protected]>
…cademySoftwareFoundation#4006) A static_assert was added recently as part of PR AcademySoftwareFoundation#3898, which assures that the size of a TileID is a multiple of 8 bytes. But on architectures with 32 bit pointers, this assertion fails. Which is not quite what we want, because the code isn't wrong if the struct isn't a multiple of 8 bytes; it's just less efficient. And we don't want to fail for 32 bit arch. Bottom line: only do this check to ensure we're in the most efficient case if we're 64 bits. Nobody's expecting highest performance on a 32 bit machine anyway. Fixes AcademySoftwareFoundation#4001 Signed-off-by: Larry Gritz <[email protected]>
Description
Optimises the code in the tile hash function. During a render this function is called billions of times so this small improvement adds up, in some test scenes it reduces total render time by an average of about 2%.
See https://godbolt.org/z/jMzz4PTce for the exact details. A summary would be:
ustring::hash()
.There is a slight change in behaviour, which is that the hash is now computed from the
m_file
pointer rather than fromm_file->filename().hash()
, so runtime reproducibility is lost. However the performance gain seems worth it.Tests
Checklist:
have previously submitted a Contributor License Agreement
(individual, and if there is any way my
employers might think my programming belongs to them, then also
corporate).
(adding new test cases if necessary).