Skip to content

Commit

Permalink
Add session and run option workload_type for applications to set effi…
Browse files Browse the repository at this point in the history
…cient mode. (#21781)

### Description
This PR added session and run option workload_type, this option is the
knob for applications to enable/disable the processor performance
efficient mode.



### Motivation and Context
The efficient mode is co-engineered with processor vendors to allow
applications voluntarily being serviced at a more energy efficient
performance level. This functionality can be used by long running,
latency insensitive application to save the energy consumption.
  • Loading branch information
AlbertGuan12345 authored Aug 28, 2024
1 parent e952774 commit ef073fd
Show file tree
Hide file tree
Showing 2 changed files with 10 additions and 0 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -49,3 +49,8 @@ static const char* const kOrtRunOptionsConfigQnnRpcControlLatency = "qnn.rpc_con
// If the value is set to -1, cuda graph capture/replay is disabled in that run.
// User are not expected to set the value to 0 as it is reserved for internal use.
static const char* const kOrtRunOptionsConfigCudaGraphAnnotation = "gpu_graph_id";

// Specify the type of workload for this run.
// “Default”: OS determines the scheduling priority and processor performance to service this workload. [Default]
// “Efficient”: OS treats this workload is efficiency oriented with low scheduling priority and efficient processor performance.
static const char* const kOrtRunOptionsWorkloadType = "run.workload_type";
Original file line number Diff line number Diff line change
Expand Up @@ -279,3 +279,8 @@ static const char* const kOrtSessionOptionsMlasGemmFastMathArm64Bfloat16 = "mlas
// Refer to MatMulNBits op schema for more details.
// If not provided, default is 4.
static const char* const kOrtSessionOptionsQDQMatMulNBitsAccuracyLevel = "session.qdq_matmulnbits_accuracy_level";

// Specify the type of workload for this session.
// “Default”: OS determines the scheduling priority and processor performance to service this workload. [Default]
// “Efficient”: OS treats this workload is efficiency oriented with low scheduling priority and efficient processor performance.
static const char* const kOrtSessionOptionsWorkloadType = "session.workload_type";

0 comments on commit ef073fd

Please sign in to comment.