
Custom OP wrapper execution provider #13216

Closed (wants to merge 45 commits)

Conversation

@adrianlizarraga (Contributor) commented on Oct 4, 2022

Description

Draft.

(AKA: lightweight EPs)

Adds the C APIs below to support custom ops that wrap an entire model to be run by an external provider. The current SNPE EP is an example of an EP that could be ported to a custom op wrapper EP. For example: the custom op stores the serialized SNPE DLC binary as a string attribute, the SNPE model is built when the kernel is created, and the model is executed with SNPE APIs when the kernel's compute method is called.
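
A minimal sketch of that flow, assuming a hypothetical `model_blob` string attribute and a placeholder `ExternalRuntime` backend (neither name is part of this PR):

    // Hypothetical kernel for a custom op that wraps an entire model.
    // "ExternalRuntime" is a stand-in for SNPE/OpenVINO/etc., and "model_blob"
    // is an assumed attribute name; adjust both for the real backend.
    #include <string>
    #include "onnxruntime_cxx_api.h"

    struct WrapperKernel {
      WrapperKernel(const OrtApi& api, const OrtKernelInfo* info) : ort_(api) {
        // The serialized backend model (e.g., an SNPE DLC) is stored as a string attribute.
        std::string blob = ort_.KernelInfoGetAttribute<std::string>(info, "model_blob");
        // backend_ = ExternalRuntime::LoadModel(blob);  // build the model once, at kernel creation
      }

      void Compute(OrtKernelContext* context) {
        // Read ORT inputs, run them through the backend, and write ORT outputs.
        const OrtValue* input = ort_.KernelContext_GetInput(context, 0);
        // ... bind tensor data, call the backend's execute routine, then fill outputs ...
        (void)input;
      }

      Ort::CustomOpApi ort_;
      // std::unique_ptr<ExternalRuntime::Model> backend_;
    };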

C APIs

| API | Description | Why |
| --- | --- | --- |
| KernelInfo_GetInputCount | Gets the number of inputs from OrtKernelInfo. | Query I/O characteristics during kernel creation [1] |
| KernelInfo_GetOutputCount | Gets the number of outputs from OrtKernelInfo. | Query I/O characteristics during kernel creation [1] |
| KernelInfo_GetInputNodeArg | Gets a read-only handle to an OrtNodeArg (cast from onnxruntime::NodeArg), which can be used to retrieve an input's name, element type, and shape. | Query I/O characteristics during kernel creation [1] |
| KernelInfo_GetOutputNodeArg | Gets a read-only handle to an OrtNodeArg (cast from onnxruntime::NodeArg), which can be used to retrieve an output's name, element type, and shape. | Query I/O characteristics during kernel creation [1] |
| NodeArg_GetName | Gets the name of an input or output. | Query I/O characteristics during kernel creation [1] |
| NodeArg_GetTypeInfo | Gets the type/shape information of an input or output. | Query I/O characteristics during kernel creation [1] |

[1]: SNPE is an example of an EP that needs to query KernelInfo for the name, type, and shape of inputs and outputs in order to build the model from the serialized DLC data. Other providers (e.g., OpenVINO) can query I/O info from the serialized model, so they do not strictly need these APIs; however, the APIs can still be used to validate the expected I/O characteristics.
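
For illustration, kernel-creation code using the proposed APIs might look roughly like this; the exact signatures are not final and are assumed here from the table above:

    // Illustration only: query input names/types at kernel-creation time via the
    // proposed C APIs. The signatures of KernelInfo_GetInputNodeArg, NodeArg_GetName,
    // and NodeArg_GetTypeInfo are assumptions based on the table above.
    #include <string>
    #include "onnxruntime_cxx_api.h"

    void QueryInputsForBackend(const OrtApi& api, const OrtKernelInfo* info) {
      size_t num_inputs = 0;
      Ort::ThrowOnError(api.KernelInfo_GetInputCount(info, &num_inputs));

      for (size_t i = 0; i < num_inputs; ++i) {
        const OrtNodeArg* arg = nullptr;
        Ort::ThrowOnError(api.KernelInfo_GetInputNodeArg(info, i, &arg));

        // Two-call pattern assumed for the name: query the length, then fill the buffer.
        size_t len = 0;
        Ort::ThrowOnError(api.NodeArg_GetName(arg, nullptr, &len));
        std::string name(len, '\0');
        Ort::ThrowOnError(api.NodeArg_GetName(arg, &name[0], &len));

        OrtTypeInfo* type_info = nullptr;
        Ort::ThrowOnError(api.NodeArg_GetTypeInfo(arg, &type_info));
        // ... read the element type and shape, then hand name/type/shape to the
        //     backend's network builder (e.g., SNPE) ...
        api.ReleaseTypeInfo(type_info);
      }
    }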

EP-Specific C APIs

Stored in a separate API struct that is retrieved with OrtApi::GetExecutionProviderApi (similar to DML EP).

| API | Description | Why |
| --- | --- | --- |
| SessionOptionsAppendExecutionProvider | Appends the OpWrapper EP to the session and adds options for one or more custom op providers. | Configure underlying provider(s) with session-level options. |
| CreateProviderOptions | Returns an opaque provider options object for a custom op provider. | Configure underlying provider(s) with session-level options. |
| KernelInfo_GetProviderOptions | Gets a custom op provider's options. The KernelInfo's EP must be of type OpWrapperExecutionProvider. | Retrieve a custom op provider's options from the kernel construction function. |
| ProviderOptions_Update | Updates one or more options for a custom op provider. | Configure underlying provider(s) with session-level options. |
| ProviderOptions_Serialize | Returns the keys and values in the provider options map. | Allow API users to get all map keys/values and create their own structures to process the data. |
| ProviderOptions_HasOption | Checks whether an option exists. | Configure underlying provider(s) with session-level options. |
| ProviderOptions_GetOption | Gets an option's value. | Configure underlying provider(s) with session-level options. |
| ReleaseProviderOptions | Releases an OrtOpWrapperProviderOptions object. | Free objects returned by CreateProviderOptions and KernelInfo_GetProviderOptions. |
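
A rough kernel-side usage sketch for these EP-specific APIs; the struct name OrtOpWrapperApi and every signature below are assumptions sketched from the table, and only GetExecutionProviderApi is an existing OrtApi entry point (used the same way by the DML EP):

    // Illustration only: kernel-side retrieval of a custom op provider's options.
    // "OpWrapper" (the provider name) and "OrtOpWrapperApi" are assumed names.
    void ReadProviderOptions(const OrtApi& api, const OrtKernelInfo* info) {
      const void* ep_api_void = nullptr;
      Ort::ThrowOnError(api.GetExecutionProviderApi("OpWrapper", ORT_API_VERSION, &ep_api_void));
      const auto* opwrapper_api = static_cast<const OrtOpWrapperApi*>(ep_api_void);

      OrtOpWrapperProviderOptions* options = nullptr;
      Ort::ThrowOnError(opwrapper_api->KernelInfo_GetProviderOptions(info, "OpenVINO_EP_Wrapper", &options));

      const char* device_type = nullptr;
      Ort::ThrowOnError(opwrapper_api->ProviderOptions_GetOption(options, "device_type", &device_type));
      // ... configure the wrapped backend with device_type ...
      opwrapper_api->ReleaseProviderOptions(options);
    }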

New EP

Adds a new execution provider type, tentatively called opwrapper. This is currently a lightweight EP that allows custom ops to query provider options.

Example of usage in a custom OP: microsoft/onnxruntime-inference-examples#150

How to create a session:

    Ort::Env env;
    Ort::SessionOptions session_opts;

    void* lib_handle = nullptr;
    Ort::ThrowOnError(Ort::GetApi().RegisterCustomOpsLibrary(static_cast<OrtSessionOptions*>(session_opts),
                                                             custom_op_dll_path,
                                                             &lib_handle));
    Ort::OpWrapper::ProviderOptions op_options;
    op_options.UpdateOptions({{"device_type", "CPU"}});

    Ort::OpWrapper::AppendExecutionProvider(session_opts, "OpenVINO_EP_Wrapper", op_options);

    Ort::Session session(env, model_path, session_opts);

Motivation and Context

Allows creation of simple "wrapper" EPs outside of the main ORT code base.

TODO

@@ -77,6 +77,8 @@ template <typename T>
#ifdef ORT_API_MANUAL_INIT
const OrtApi* Global<T>::api_{};
inline void InitApi() { Global<void>::api_ = OrtGetApiBase()->GetApi(ORT_API_VERSION); }
inline void InitApi(const OrtApi* api) { Global<void>::api_ = api; }
inline bool ApiIsInit() { return Global<void>::api_ != nullptr; }
adrianlizarraga (author) commented:

Need to discuss the best way to handle this. See: #12998 (comment)
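
For context, the new overload matters for custom op libraries built with ORT_API_MANUAL_INIT, which receive the API handle from the host instead of calling OrtGetApiBase() themselves. A minimal sketch (the domain-registration body is omitted):

    // Minimal sketch of a custom op library entry point using the new overload.
    // Assumes the library is compiled with ORT_API_MANUAL_INIT so the C++ wrapper
    // does not resolve the API on its own.
    #define ORT_API_MANUAL_INIT
    #include "onnxruntime_cxx_api.h"
    #undef ORT_API_MANUAL_INIT

    extern "C" OrtStatus* ORT_API_CALL RegisterCustomOps(OrtSessionOptions* options,
                                                         const OrtApiBase* api_base) {
      // Hand the host-provided OrtApi to the C++ wrapper instead of calling
      // OrtGetApiBase()->GetApi(...) from inside the shared library.
      Ort::InitApi(api_base->GetApi(ORT_API_VERSION));
      // ... create a custom op domain and add it to `options` here ...
      (void)options;
      return nullptr;
    }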

/// a KernelInfo object. Use it to access the name or type info of an input
/// or output.
/// </summary>
struct NodeArg {
adrianlizarraga (author) commented:

Will update to use new Ort::Base<> impl if #13215 is merged first

Member commented:

This exposes the graph. Need to carefully consider the implications

adrianlizarraga (author) commented on Oct 6, 2022:

Yes, this is one of my concerns. I saw that the Python bindings also expose NodeArg (albeit in a more limited fashion), so I felt compelled to try it this way to see what you'd think. https://github.com/microsoft/onnxruntime/blob/main/onnxruntime/python/onnxruntime_pybind_state.cc#L1365

If we don't feel good about exposing NodeArg, then one alternative is to create more APIs on top of KernelInfo to accomplish the same tasks (possible signatures are sketched below). Ex:

  • KernelInfo_GetInputName(..., size_t index, ...)
  • KernelInfo_GetOutputName(..., size_t index, ...)
  • KernelInfo_GetInputTypeInfo(..., size_t index, ...)
  • KernelInfo_GetOutputTypeInfo(..., size_t index, ...)

The drawback is that this requires more APIs and that we have to keep passing in the index.
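
For illustration, possible signatures for that first alternative, as they might appear inside the OrtApi struct (hypothetical; loosely following the existing two-call C API style):

    // Hypothetical per-index KernelInfo accessors; illustration only, not final.
    ORT_API2_STATUS(KernelInfo_GetInputName, _In_ const OrtKernelInfo* info, size_t index,
                    _Out_writes_opt_(*size) char* out, _Inout_ size_t* size);
    ORT_API2_STATUS(KernelInfo_GetInputTypeInfo, _In_ const OrtKernelInfo* info, size_t index,
                    _Outptr_ OrtTypeInfo** type_info);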

Another alternative is to allocate an entirely new object that only has the data that we need. We don't have to return a pointer to an internal onnxruntime::NodeArg. Instead, we return a pointer to a new, more limited, object. I think I may like this more.

extern "C" {
#endif
// Disable C++ linter in this file
// NOLINTBEGIN
adrianlizarraga (author) commented:

Should remove this now that clang-tidy has been disabled

INPUT_OUTPUT_REQUIRED = 0,
INPUT_OUTPUT_OPTIONAL,
INPUT_OUTPUT_VARIADIC,
adrianlizarraga (author) commented:

Need to think more about any repercussions this may have.

@@ -59,6 +59,7 @@ option(onnxruntime_USE_NNAPI_BUILTIN "Build with builtin NNAPI lib for Android N
option(onnxruntime_USE_SNPE "Build with SNPE support" OFF)
option(onnxruntime_USE_RKNPU "Build with RKNPU support" OFF)
option(onnxruntime_USE_DNNL "Build with DNNL support" OFF)
option(onnxruntime_USE_OPWRAPPER "Build with operator EP wrapper support" OFF)
adrianlizarraga (author) commented:

TODO: Need to think about how this should work in minimal builds.

// Ex: auto name = node_arg.GetName(allocator);
// name.first.get(); // Get char*
// name.second; // Get length
std::pair<AllocatedStringPtr, size_t> GetName(OrtAllocator* allocator) const;
yuslepukhin (Member) commented on Oct 6, 2022:

> GetName(OrtAllocator* allocator)

Allocating names using an ORT allocator is a strange practice and certainly should not surface at the C++ wrapper level. Make a copy into std::string and deallocate within the C++ wrapper so the user does not have these headaches. Need to discuss.

Otherwise, if we do want to use the allocator, make the string zero-terminated to avoid the std::pair.

adrianlizarraga (author) commented:

I see. I've seen that previous APIs handled returning names in various ways, and I wasn't quite sure which is the currently accepted practice.

One method (A) requires the user to first query for the name's size (by passing nullptr), then the user allocates memory, and finally calls again to have the API fill in the name. The other method (B) just passes in an allocator, which the user must use to free the data.

I can change this to use method A.
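
A sketch of the reviewer's suggestion, where the C++ wrapper hides the C-level query and returns a plain std::string; NodeArg_GetName and its two-call behavior (including the size counting the null terminator) are assumptions:

    // Hypothetical C++ wrapper method that hides the C-level name query and returns
    // std::string, so callers never deal with allocators or (pointer, length) pairs.
    std::string NodeArg::GetName() const {
      size_t len = 0;
      // First call with nullptr: query the required buffer size (assumed behavior).
      Ort::ThrowOnError(GetApi().NodeArg_GetName(p_, nullptr, &len));
      std::string name(len, '\0');
      // Second call: fill the caller-owned buffer.
      Ort::ThrowOnError(GetApi().NodeArg_GetName(p_, &name[0], &len));
      // Drop the trailing terminator if the reported size includes it (assumption).
      if (!name.empty() && name.back() == '\0') name.pop_back();
      return name;
    }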

template <typename T> // T is only implemented for std::vector<float>, std::vector<int64_t>
std::vector<T> GetAttributes(const char* name) const;

[[nodiscard]] size_t GetInputCount() const;
Member commented:

> [[nodiscard]]

We disabled clang-tidy. This would require users to have C++17.

adrianlizarraga (author) commented:

I'll remove [[nodiscard]].

// Returns the node's type and shape information.
[[nodiscard]] TypeInfo GetTypeInfo() const;

constexpr explicit operator const OrtNodeArg*() const noexcept { return p_; }
Member commented:

> explicit

We do not want that. If we did, the practice would be to use get().

adrianlizarraga (author) commented on Oct 6, 2022:

Edit: Ignore this; I misunderstood. I'll change it to get() or get rid of the explicit.

That's what I thought too, but I did it this way in preparation for using the new detail::Base<> class in your PR, which has constexpr operator contained_type*() const noexcept { return p_; }. I wanted to make sure that this class was used in the same way to ease merging.

But for now, I can change it to get() since it's really not a big deal to change it later.

yuslepukhin (Member) commented:
Let's not use any of the C API in the C++ code. The PR description exposes a lib handle that is not owned.

adrianlizarraga (author) commented on Oct 6, 2022:

We chatted and came away with the following implementation notes:

Try to use existing SessionOptionsAppendExecutionProvider C API

This would eliminate most of the EP-specific APIs. Because this function expects an EP name and an unordered_map<string, string>, we would need a way to target options to specific custom ops.

Ex:

session_opts.AppendExecutionProvider("OpWrapperEP::OpenVINO_Wrapper_OP", {{"device_type", "CPU"}});
session_opts.AppendExecutionProvider("OpWrapperEP::SNPE_Wrapper_OP", {{"buffer_type", "float"}});

Note that the EP name is using :: to scope options. There's probably a better way.
Each of the above calls would create a separate instance of the proxy EP.
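
For reference, the existing C API entry point that the C++ call above maps to already takes a provider name plus parallel key/value arrays, so the scoped-name idea needs no new plumbing. A sketch of the equivalent raw call (the "OpWrapperEP::..." scoping is still just the convention proposed above, not a final API):

    // Equivalent call through the existing OrtApi::SessionOptionsAppendExecutionProvider.
    // session_opts is an Ort::SessionOptions, as in the session example earlier.
    const char* keys[] = {"device_type"};
    const char* values[] = {"CPU"};
    Ort::ThrowOnError(Ort::GetApi().SessionOptionsAppendExecutionProvider(
        static_cast<OrtSessionOptions*>(session_opts),
        "OpWrapperEP::OpenVINO_Wrapper_OP",
        keys, values, 1));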

Bake input/output names, shapes, and element types into node attributes

This would eliminate the need for the following APIs:

  • KernelInfo_GetInputNodeArg
  • KernelInfo_GetOutputNodeArg
  • NodeArg_* (all gone; no more exposing NodeArg)

We may still need KernelInfo_GetInputCount and KernelInfo_GetOutputCount.
We also need to provide scripts/utilities to make it easier to create these wrapper ONNX models.
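
A rough sketch of what such a utility could do, using the ONNX protobuf C++ bindings; the attribute names (model_blob, input_names), op type, and domain are placeholders for illustration, not a defined schema:

    // Sketch: build the single "wrapper" node whose attributes carry the serialized
    // backend model plus baked-in I/O metadata.
    #include <string>
    #include "onnx/onnx_pb.h"

    onnx::NodeProto MakeWrapperNode(const std::string& backend_blob) {
      onnx::NodeProto node;
      node.set_op_type("OpenVINO_Wrapper_OP");   // custom op type (placeholder)
      node.set_domain("test.op_wrapper");        // custom op domain (placeholder)
      node.add_input("input0");
      node.add_output("output0");

      // Serialized backend model stored as a string attribute.
      auto* blob_attr = node.add_attribute();
      blob_attr->set_name("model_blob");
      blob_attr->set_type(onnx::AttributeProto::STRING);
      blob_attr->set_s(backend_blob);

      // Baked-in input names; element types and shapes could be stored similarly
      // with INTS attributes.
      auto* names_attr = node.add_attribute();
      names_attr->set_name("input_names");
      names_attr->set_type(onnx::AttributeProto::STRINGS);
      names_attr->add_strings("input0");

      return node;
    }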

Rename EP to better indicate that this is a "proxy"

Calling this an EP is confusing. We should also edit the linked example PR description to include the full ONNX graph image and explain how the node attributes contain the entire model to be run with other APIs (e.g., OpenVINO, SNPE).

I think these were the major implementation points. @pranavsharma @jywu-msft @yuslepukhin please let me know what I missed. Thanks.

adrianlizarraga (author) commented:
This PR will change significantly, so no need to review for now (unless you want to :)).

adrianlizarraga (author) commented:
New implementation: #13427

adrianlizarraga deleted the adrianl/custom-op-ep branch on October 25, 2022, 00:43.