NNAPI usage via onnxruntime #10692
-
Hi Team, I also had another query: is it necessary to quantize the ONNX models if I have to utilize the DSP runtime via NNAPI and ONNX Runtime? Other runtime platforms do instruct you to do so.
Replies: 2 comments
-
Hi, do you know how to select different devices now? Could you share your solution?
-
The ORT NNAPI EP will be preferred over the ORT CPU EP when assigning nodes in the model. If the NNAPI EP can handle a specific operator ('handle' meaning convert to the equivalent NNAPI operator), nodes involving that operator will be assigned to the NNAPI EP. Any remaining nodes will be handled by the ORT CPU EP.

The NNAPI EP will create NNAPI model(s) from the nodes assigned to it at runtime. We don't have much control over how NNAPI itself will execute the NNAPI model we create. It will internally pick whatever it thinks is best.

Setting the NNAPI_FLAG_CPU_DISABLED flag will prevent NNAPI from running operators in the NNAPI model that only have a CPU implementation. If there are operators like that in the NNAPI model, the NNAPI model creation will fail. ORT does not fall back to using the ORT CPU EP for that node, as it has already been included in the NNAPI model. Basically that means the NNAPI_FLAG_CPU_DISABLED flag is good for testing whether the nodes assigned to NNAPI will run well (the NNAPI fallback CPU implementation is not optimized in any way).

FWIW, NNAPI performance differs dramatically across devices. Your best bet is to run the model both with the NNAPI EP enabled and disabled and pick whichever option is best on a per-device basis.

Whether you need to quantize or not will most likely depend on the hardware in the particular device. The NNAPI spec supports both quantized and float data types in general, but the device implementation of the spec may only support quantized types. E.g. if the NNAPI implementation for the particular DSP only supports 8-bit data types, you'll need to quantize the model to use that. That doesn't mean NNAPI can't be used - for example, the GPU on the device may handle float data types and your model could potentially be executed by NNAPI using the GPU instead of the DSP.
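And here is a rough sketch of the per-device comparison suggested above: time the same model with the NNAPI EP enabled versus the ORT CPU EP only and keep whichever is faster on that device. The model path, input/output names, and input shape are hypothetical and must be replaced with the values for your model.

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <vector>

#include "onnxruntime_cxx_api.h"
#include "nnapi_provider_factory.h"

// Placeholder model details for illustration only; substitute your own
// model path, input/output names, and input shape.
static const char* kModelPath = "/data/local/tmp/model.onnx";
static const char* kInputName = "input";
static const char* kOutputName = "output";
static const std::vector<int64_t> kInputShape = {1, 3, 224, 224};

// Average latency (ms) over `iterations` Run() calls on the given session.
static double AverageLatencyMs(Ort::Session& session, int iterations) {
  size_t element_count = 1;
  for (int64_t d : kInputShape) element_count *= static_cast<size_t>(d);
  std::vector<float> input_data(element_count, 0.5f);

  Ort::MemoryInfo mem_info =
      Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
  Ort::Value input = Ort::Value::CreateTensor<float>(
      mem_info, input_data.data(), input_data.size(),
      kInputShape.data(), kInputShape.size());

  const char* input_names[] = {kInputName};
  const char* output_names[] = {kOutputName};

  auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < iterations; ++i) {
    session.Run(Ort::RunOptions{nullptr}, input_names, &input, 1, output_names, 1);
  }
  auto end = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(end - start).count() / iterations;
}

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "nnapi_vs_cpu");

  // Session with the NNAPI EP registered; unsupported nodes still run on the ORT CPU EP.
  Ort::SessionOptions nnapi_opts;
  Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_Nnapi(nnapi_opts, 0));
  Ort::Session nnapi_session(env, kModelPath, nnapi_opts);

  // Session with the ORT CPU EP only (NNAPI EP not registered).
  Ort::SessionOptions cpu_opts;
  Ort::Session cpu_session(env, kModelPath, cpu_opts);

  std::printf("NNAPI EP average latency: %.2f ms\n", AverageLatencyMs(nnapi_session, 50));
  std::printf("CPU EP average latency:   %.2f ms\n", AverageLatencyMs(cpu_session, 50));
  return 0;
}
```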