Custom OP wrapper execution provider #13216
Conversation
@@ -77,6 +77,8 @@ template <typename T>
#ifdef ORT_API_MANUAL_INIT
const OrtApi* Global<T>::api_{};
inline void InitApi() { Global<void>::api_ = OrtGetApiBase()->GetApi(ORT_API_VERSION); }
inline void InitApi(const OrtApi* api) { Global<void>::api_ = api; }
inline bool ApiIsInit() { return Global<void>::api_ != nullptr; }
Need to discuss the best way to handle this. See: #12998 (comment)
/// a KernelInfo object. Use it to access the name or type info of an input
/// or output.
/// </summary>
struct NodeArg {
Will update to use new Ort::Base<> impl if #13215 is merged first
This exposes the graph. Need to carefully consider the implications
Yes, this is one of my concerns. I saw that the python bindings also expose NodeArg (albeit in a more limited fashion) so I felt compelled to try it this way to see what you'd think. https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/python/onnxruntime_pybind_state.cc#L1365
If we don't feel good about exposing NodeArg, then one alternative is to create more APIs on top of KernelInfo to accomplish the same tasks, e.g.:
- KernelInfo_GetInputName(..., size_t index, ...)
- KernelInfo_GetOutputName(..., size_t index, ...)
- KernelInfo_GetInputTypeInfo(..., size_t index, ...)
- KernelInfo_GetOutputTypeInfo(..., size_t index, ...)
The drawback is that this requires more APIs and that we have to keep passing in the index.
Another alternative is to allocate an entirely new object that only has the data that we need. We don't have to return a pointer to an internal onnxruntime::NodeArg. Instead, we return a pointer to a new, more limited, object. I think I may like this more.
extern "C" {
#endif
// Disable C++ linter in this file
// NOLINTBEGIN
Should remove this now that clang-tidy has been disabled
INPUT_OUTPUT_REQUIRED = 0,
INPUT_OUTPUT_OPTIONAL,
INPUT_OUTPUT_VARIADIC,
Need to think more about any repercussions this may have.
@@ -59,6 +59,7 @@ option(onnxruntime_USE_NNAPI_BUILTIN "Build with builtin NNAPI lib for Android N
option(onnxruntime_USE_SNPE "Build with SNPE support" OFF)
option(onnxruntime_USE_RKNPU "Build with RKNPU support" OFF)
option(onnxruntime_USE_DNNL "Build with DNNL support" OFF)
option(onnxruntime_USE_OPWRAPPER "Build with operator EP wrapper support" OFF)
TODO: Need to think about how this should work in minimal builds.
// Ex: auto name = node_arg.GetName(allocator);
// name.first.get(); // Get char*
// name.second; // Get length
std::pair<AllocatedStringPtr, size_t> GetName(OrtAllocator* allocator) const;
Allocating names with the ORT allocator is a strange practice, and it certainly should not surface at the C++ level. Make a copy into std::string and deallocate within the C++ wrapper so the user does not have these headaches. Need to discuss.
Otherwise, if we do want to use the allocator, make the string zero-terminated to avoid the std::pair.
I see. I saw that previous APIs handled returning names in various ways, and I wasn't quite sure which was the currently accepted practice.
One method (A) requires the user to first query for the name's size (by passing nullptr), then the user allocates memory, and finally calls again to have the API fill in the name. The other method (B) just passes in an allocator, which the user must use to free the data.
I can change this to use method A.
template <typename T> // T is only implemented for std::vector<float>, std::vector<int64_t>
std::vector<T> GetAttributes(const char* name) const;

[[nodiscard]] size_t GetInputCount() const;
I'll remove [[nodiscard]].
// Returns the node's type and shape information.
[[nodiscard]] TypeInfo GetTypeInfo() const;

constexpr explicit operator const OrtNodeArg*() const noexcept { return p_; }
Edit: Ignore this. I misunderstood. I'll change to get() or get rid of the explicit.
That's what I thought too, but I did it this way in preparation for using the new detail::Base<> class in your PR, which has constexpr operator contained_type*() const noexcept { return p_; }. I wanted to make sure that this class was used in the same way to ease merging. But for now, I can change it to get(), since it's really not a big deal to change it later.
Let's not use any of the C API in the C++ code. The PR description exposes a lib handle that is not owned.
We chatted and came away with the following implementation notes: Try to use existing
This PR will change significantly, so no need to review for now (unless you want to :)).
New implementation: #13427
Description
Draft.
(AKA: lightweight EPs)
Adds the C APIs below to support custom ops that wrap an entire model, which is then inferenced with an external provider. The current SNPE EP is an example of an EP that could be ported to use a custom op wrapper EP. For example: the custom op stores the serialized SNPE DLC binary as a string attribute, the SNPE model is built when the kernel is created, and the model is inferenced with SNPE APIs when the kernel's compute method is called.
C APIs
- KernelInfo_GetInputCount: gets the number of inputs from an OrtKernelInfo.
- KernelInfo_GetOutputCount: gets the number of outputs from an OrtKernelInfo.
- KernelInfo_GetInputNodeArg: returns an OrtNodeArg (cast from onnxruntime::NodeArg), which can be used to retrieve an input's name, element type, and shape.
- KernelInfo_GetOutputNodeArg: returns an OrtNodeArg (cast from onnxruntime::NodeArg), which can be used to retrieve an output's name, element type, and shape.
- NodeArg_GetName
- NodeArg_GetTypeInfo

1: SNPE is an example of an EP that needs to be able to query KernelInfo for the name, type, and shape of inputs and outputs in order to build the model from the serialized DLC data. Other providers (e.g., OpenVINO) are able to query I/O info from the serialized model, so they do not strictly need these APIs. However, the APIs can still be used to validate the expected I/O characteristics.

EP-Specific C APIs
Stored in a separate API struct that is retrieved with OrtApi::GetExecutionProviderApi (similar to the DML EP).
- SessionOptionsAppendExecutionProvider
- CreateProviderOptions
- KernelInfo_GetProviderOptions: gets the provider options used by the OpWrapperExecutionProvider.
- ProviderOptions_Update
- ProviderOptions_Serialize
- ProviderOptions_HasOption
- ProviderOptions_GetOption
- ReleaseProviderOptions: releases an OrtOpWrapperProviderOptions object, as created by CreateProviderOptions and KernelInfo_GetProviderOptions.

New EP
Adds a new execution provider type tentatively called opwrapper. This is currently a light EP that allows custom ops to query provider options.
Example of usage in a custom OP: microsoft/onnxruntime-inference-examples#150
How to create a session:
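A rough sketch, using only the API names introduced in this draft PR; the exact signatures here are illustrative guesses, not a released ORT interface.

```cpp
// Illustrative only: the op-wrapper API struct and its functions are
// draft APIs from this PR, and these signatures are guesses.
Ort::SessionOptions session_options;

// Retrieve the op-wrapper EP's API struct (similar to the DML EP).
const OrtOpWrapperApi* op_wrapper_api = nullptr;  // hypothetical struct name
ort_api->GetExecutionProviderApi("OpWrapper", ORT_API_VERSION,
                                 reinterpret_cast<const void**>(&op_wrapper_api));

// Create provider options for the custom op to query at kernel creation,
// then append the EP and create the session as usual.
OrtOpWrapperProviderOptions* provider_options = nullptr;
op_wrapper_api->CreateProviderOptions(&provider_options);
// ... populate options via ProviderOptions_Update ...
op_wrapper_api->SessionOptionsAppendExecutionProvider(session_options,
                                                      provider_options);
Ort::Session session(env, model_path, session_options);
op_wrapper_api->ReleaseProviderOptions(provider_options);
```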
Motivation and Context
Allows creation of simple "wrapper" EPs outside of the main ORT code base.
TODO