-
Notifications
You must be signed in to change notification settings - Fork 360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add API for building interpolated getter free functions. #1765
Conversation
Signed-off-by: uedaki <[email protected]>
@@ -320,9 +320,9 @@ BackendLLVM::llvm_type_groupdata() | |||
TypeDesc* types = &group().m_userdata_types[0]; | |||
int* offsets = &group().m_userdata_offsets[0]; | |||
int sz = (nuserdata + 3) & (~3); | |||
fields.push_back(ll.type_array(ll.type_bool(), sz)); | |||
fields.push_back(ll.type_array(ll.type_int8(), sz)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you talk to changing the type of the
bool GroupData::userdata_init_flags;
to
int8 GroupData::userdata_init_flags;
?
Is it to ensure the size of the GroupData structure is consistent across platforms?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
userdata_init_flags
is a variable used as a char
in osl_bind_interpolated_param
which can take 3 different values: not initialized (0), not found (1), and initialized (2).
Boolean values are treated as 2-state variables with llvm operators. Due to this limitation, it was impossible to implement the osl_bind_interpolated_param
functionality in build_interpolated_getter
with llvm operators. Cases 1 and 2 were treated the same way with the bool array.
To solve this problem, I changed the array type from bool to int8. This solution doesn't change the logic, since userdata_init_flags
was already used as a char
array.
That way I could implement osl_bind_interpolated_param
with llvm operators for the build_interpolated_getter
functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Uedaki , excellent explanation. Can you add a comment that it is representing 3 states and list them.
Follow on, should it just go to a vanilla int vs int8? Do we gain much by using 1 byte vs 4 bytes, not sure if alignment rules would cause the following variable to start on 4 byte boundary anyway.
std::vector<llvm::Value*> args; | ||
args.reserve(spec.arg_count() + 1); | ||
for (size_t index = 0; index < spec.arg_count(); index++) { | ||
const auto& arg = spec.arg(index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though the code flow says it, maybe add a comment here explaining that the argument could be dynamic value identified by an enum or constant value that can directly be emitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd rather keep the virtual functions in a good logical order when reading the header. Appending to the end is a bit of a suspect way to avoiding ABI break, I don't think it's guaranteed.
We're allowed to break ABI compatibility in main. We just need to adjust the version numbers to reflect it. Which we can do in a subsequent commit (we already need a subsequent commit in order to update the reference commit for ABI compatibility, so this is not a problem).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I put this comment in the wrong place!
} | ||
} | ||
args.push_back( | ||
ll.void_ptr(groupdata_field_ptr(2 + userdata_index))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comment identifying which field your are extracting
src/testshade/simplerend.cpp
Outdated
spec.set(rs_get_interpolated_test); | ||
} else { | ||
spec.set(rs_get_interpolated_generic, | ||
InterpolatedSpecBuiltinArg::OpaqueExecutionContext, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we always know param_name, type, and derivatives at compile time, do we need a fallback at all?
Either we should be able to emit a specific call or not?
src/testshade/rs_simplerend.cpp
Outdated
const OSL::TypeDesc type = OSL::TypeDesc_from(_type); | ||
|
||
OSL::ShaderGlobals* sg = (OSL::ShaderGlobals*)sg_; | ||
return sg->renderer->get_userdata(derivatives, name, type, sg, val); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be device compatible, we should break encapsulation of the first parameter OSL::OpaqueExecContextPtr
by casting it to ShaderGlobals
. In particular we can't make calls to a RendererServices
virtual function on the device side.
So what ever functionality you want from sg->renderer->get_userdata
should be placed right here with any required supporting data placed inside struct RenderState
located testshade/render_state.h
The idea being if you are using free functions version, you wouldn't even need to implement RendererServices::get_userdata, because rs_get_interpolated* has replaced it.
src/testshade/rs_simplerend.cpp
Outdated
} | ||
|
||
OSL_RSOP bool | ||
rs_get_interpolated_generic(void* sg_, OSL::ustringhash_pod _name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
void* sg_
should be
OSL::OpaqueExecContextPtr ec
@Uedaki , overall this looks great! There are just some encapsulation/device compatibility issues in the testshade rs_get_interpolated_* functions to clear up and a few more comments. |
@lgritz , can you comment on the CI /VFX2023 abi check failing? Is that because a new virtual function was added to RendererServices? Do we need to bump a version number or something to indicate the ABI might be changing? |
From the CI build artifacts: rendererservices.hnamespace OSL_v1_14_0 [−] class RendererServices 1
| Change | Effect
-- | -- | --
Virtual method build_interpolated_getter ( ShaderGroup const&, OpenImageIO_v2_4::ustring const&, struct OpenImageIO_v2_4::TypeDesc, bool, InterpolatedGetterSpec& ) has been added to this class. | The layout of v-table has been changed. Call of any virtual method at higher position in this class or its subclasses may result in crash or incorrect behavior of applications.
[+] show v-table (old and new) So perhaps the new virtual method needs to be moved after all the other virtual methods, try that. Then on ABI breaking version changes, we could rearrange. Add comment why it's at the end. |
I'd rather keep the virtual functions in a good logical order when reading the header. Appending to the end is a bit of a suspect way to avoiding ABI break, I don't think it's guaranteed. We're allowed to break ABI compatibility in main. We just need to adjust the version numbers to reflect it. Which we can do in a subsequent commit (we already need a subsequent commit in order to update the reference commit for ABI compatibility, so this is not a problem). Unless you need this backported to 1.13, in which case we should do this (even if not guaranteed) to try to preserve ABI compatibility in practice as best we can. |
Signed-off-by: uedaki <[email protected]>
Signed-off-by: uedaki <[email protected]>
Signed-off-by: uedaki <[email protected]>
Signed-off-by: uedaki <[email protected]>
Signed-off-by: uedaki <[email protected]>
Signed-off-by: uedaki <[email protected]>
Signed-off-by: uedaki <[email protected]>
@lgritz (asking as I think I might be missing something here) Why wouldn't we just require this PR update the version number, as it will need to to pass CI before we merge right? |
Bumping the version is what we do when we have to break the ABI, yes! But that alone won't make it pass the automated test that compares the ABI to a designated prior commit (named by its hash) that is the "reference" ABI we are trying to match. So the sequence we follow is:
You can't combine these into just one step, because the actual merge of PR (1) will result in a different commit hash than when the PR was submitted. There's no way to know what the new ABI reference commit's hash will be post-merge, so you simply cannot include it in step 1. If you can think of a better solution to this, I'm all ears. I haven't been able to think of an easy way to say "make THIS commit, whatever its hash may end up being, be the new ABI reference going forward." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, real test will be enabling in production renderer to see if all scenarios where covered.
7316669
into
AcademySoftwareFoundation:main
This won't be picked up by the platforms yet, they are still using 1.13.5.x for 2023.10 (as well as still OIIO 2.6.0.x, which is too old for this OSL release). So really the only client for this build is likely Arnold. Highlights: * Raise OpenImageIO to 2.6.1.0. * RS::get_texture_info make robust to empty shaderglobals param (AcademySoftwareFoundation#1731) * LLVM opaque pointers support (AcademySoftwareFoundation#1728) * LLVM new pass manager support (AcademySoftwareFoundation#1729) * LLVM 16 and 17 support (AcademySoftwareFoundation#1730) * fix: SymOverrideInfo bitfields should be the same type (AcademySoftwareFoundation#1745) * fix: batched pointcloud if "index" is not passed (AcademySoftwareFoundation#1753) * api: Ustringhash phase 3 (AcademySoftwareFoundation#1732) * fix: Fix NVPTX TargetMachine leak, etc. (AcademySoftwareFoundation#1763) * feat(api): Add API for building interpolated getter free functions. (AcademySoftwareFoundation#1765) * build: add options to use static Cuda libraries (AcademySoftwareFoundation#1772) * build(deps): Initial support for LLVM-18 (AcademySoftwareFoundation#1773) * fix: Mute partio error prints (AcademySoftwareFoundation#1774) * fix: calculatenormal needs fliphandedness (AcademySoftwareFoundation#1783) * fix: Print closure missing error message at compile time instead of run time. (AcademySoftwareFoundation#1781) * fix: GPU interpolated param initialization (AcademySoftwareFoundation#1791) * oslquery: Simplify include and link needs (AcademySoftwareFoundation#1787) * Make isconnected() work with downstream renderer "connections." (AcademySoftwareFoundation#1782) Also along for the ride: * Remove the `blah-platform/=1.0` preamble, no longer needed. * Stop building for deprecated Nuke 12 and 13 platforms. * Raise cmake build version to 3.24 See merge request spi/dev/3rd-party/osl-feedstock!57
Description
Adds a new
RendererServices
APIbuild_interpolated_getter
which allows for building custom free functions that provide values for interpolated datas.The new API is involved at material compilation time and allow the developers to provide specialized functions leveraging the information known at compile time. It will allow for many optimization opportunities for the compiler by replacing runtime branch with direct memory reads.
This PR is a followup to the API build_attribute_getter proposed in this PR: #1704
Tests
TestShade provides an example implementation showing how this compile time information can be used to select an appropriate function to use as the attribute provider, and how to configure the function signature.
Since the feature is enable by default when
rs_bitcode
is used, the existing testsuite ensure the feature is generating the same result as the previous API.Checklist:
already run clang-format v17 before submitting, I definitely will look at
the CI test that runs clang-format and fix anything that it highlights as
being nonconforming.