From 484c1095ec4a99efcd97af428dd5edb4fedf55c1 Mon Sep 17 00:00:00 2001
From: Heather Halter
Date: Tue, 13 Feb 2024 11:11:52 -0800
Subject: [PATCH] Add searchbp metrics to Performance Analyzer (#5390)

* added searchbp metrics

Signed-off-by: Heather Halter

* Update reference.md

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Update reference.md

Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Nathan Bower
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>

---------

Signed-off-by: Heather Halter
Signed-off-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: Naarcha-AWS <97990722+Naarcha-AWS@users.noreply.github.com>
Co-authored-by: Nathan Bower
---
 _monitoring-your-cluster/pa/reference.md | 184 ++++++++++++++++++++---
 1 file changed, 165 insertions(+), 19 deletions(-)

diff --git a/_monitoring-your-cluster/pa/reference.md b/_monitoring-your-cluster/pa/reference.md
index c06d59de38..8b076b1ba5 100644
--- a/_monitoring-your-cluster/pa/reference.md
+++ b/_monitoring-your-cluster/pa/reference.md
@@ -743,27 +743,173 @@ The following metrics are relevant to the cluster as a whole and do not require
+## Relevant dimensions: `NodeID`, `searchbp_mode`
+
+| Metric | Description |
+|--------|-------------|
+| SearchBP_Shard_Stats_CancellationCount | The number of tasks marked for cancellation at the shard task level. |
+| SearchBP_Shard_Stats_LimitReachedCount | The number of times that the cancellable task total exceeded the set cancellation threshold at the shard task level. |
+| SearchBP_Shard_Stats_Resource_Heap_Usage_CancellationCount | The number of tasks marked for cancellation at the shard task level because of excessive heap usage, counted since the node last restarted. |
+| SearchBP_Shard_Stats_Resource_Heap_Usage_CurrentMax | The maximum heap usage for tasks currently running at the shard task level. |
+| SearchBP_Shard_Stats_Resource_Heap_Usage_RollingAvg | The rolling average heap usage for the _n_ most recent tasks at the shard task level. The default value for _n_ is `100`. |
+| SearchBP_Shard_Stats_Resource_CPU_Usage_CancellationCount | The number of tasks marked for cancellation at the shard task level because of excessive CPU usage, counted since the node last restarted. |
+| SearchBP_Shard_Stats_Resource_CPU_Usage_CurrentMax | The maximum CPU time for all tasks currently running on the node at the shard task level. |
+| SearchBP_Shard_Stats_Resource_CPU_Usage_CurrentAvg | The average CPU time for all tasks currently running on the node at the shard task level. |
+| SearchBP_Shard_Stats_Resource_ElaspedTime_Usage_CancellationCount | The number of tasks marked for cancellation at the shard task level because of excessive elapsed time, counted since the node last restarted. |
+| SearchBP_Shard_Stats_Resource_ElaspedTime_Usage_CurrentMax | The maximum time elapsed for all tasks currently running on the node at the shard task level. |
+| SearchBP_Shard_Stats_Resource_ElaspedTime_Usage_CurrentAvg | The average time elapsed for all tasks currently running on the node at the shard task level. |
+| Searchbp_Task_Stats_CancellationCount | The number of tasks marked for cancellation at the search task level. |
+| SearchBP_Task_Stats_LimitReachedCount | The number of times that the cancellable task total exceeded the set cancellation threshold at the search task level. |
+| SearchBP_Task_Stats_Resource_Heap_Usage_CancellationCount | The number of tasks marked for cancellation at the search task level because of excessive heap usage, counted since the node last restarted. |
+| SearchBP_Task_Stats_Resource_Heap_Usage_CurrentMax | The maximum heap usage for tasks currently running at the search task level. |
+| SearchBP_Task_Stats_Resource_Heap_Usage_RollingAvg | The rolling average heap usage for the _n_ most recent tasks at the search task level. The default value for _n_ is `10`. |
+| SearchBP_Task_Stats_Resource_CPU_Usage_CancellationCount | The number of tasks marked for cancellation at the search task level because of excessive CPU usage, counted since the node last restarted. |
+| SearchBP_Task_Stats_Resource_CPU_Usage_CurrentMax | The maximum CPU time for all tasks currently running on the node at the search task level. |
+| SearchBP_Task_Stats_Resource_CPU_Usage_CurrentAvg | The average CPU time for all tasks currently running on the node at the search task level. |
+| SearchBP_Task_Stats_Resource_ElaspedTime_Usage_CancellationCount | The number of tasks marked for cancellation at the search task level because of excessive elapsed time, counted since the node last restarted. |
+| SearchBP_Task_Stats_Resource_ElaspedTime_Usage_CurrentMax | The maximum time elapsed for all tasks currently running on the node at the search task level. |
+| SearchBP_Task_Stats_Resource_ElaspedTime_Usage_CurrentAvg | The average time elapsed for all tasks currently running on the node at the search task level. |
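The `RollingAvg` metrics above are described as an average over the _n_ most recent tasks. As a rough illustration of that windowing idea only (this is not taken from the OpenSearch source; the `RollingAverage` class and its `record` method are hypothetical names), a fixed-size window can be sketched in Python like this:

```python
from collections import deque


class RollingAverage:
    """Average of the n most recent observations (illustrative sketch only)."""

    def __init__(self, window_size: int = 100) -> None:
        if window_size <= 0:
            raise ValueError("window_size must be positive")
        # A deque with maxlen discards the oldest entry once it is full.
        self._window = deque(maxlen=window_size)
        self._total = 0.0

    def record(self, heap_bytes: float) -> float:
        """Add one completed task's heap usage and return the new average."""
        # When the window is full, the oldest value is about to be evicted,
        # so remove it from the running total first.
        if len(self._window) == self._window.maxlen:
            self._total -= self._window[0]
        self._window.append(heap_bytes)
        self._total += heap_bytes
        return self._total / len(self._window)


# Hypothetical usage: a window of 3 tasks instead of the documented defaults.
avg = RollingAverage(window_size=3)
for usage in (10, 20, 30, 40):
    latest = avg.record(usage)
```

With a window of 3, the fourth observation pushes out the first, so the average reflects only the last three tasks. The documented defaults for _n_ (`100` at the shard task level, `10` at the search task level) would correspond to `window_size` in this sketch.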
+

 ## Dimensions reference

 | Dimension | Return values |
 |----------------------|-------------------------------------------------|
-| ShardID | The ID of the shard, for example, `1`. |
-| IndexName | The name of the index, for example, `my-index`. |
-| Operation | The type of operation, for example, `shardbulk`. |
-| ShardRole | The shard role, for example, `primary` or `replica`. |
-| Exception | OpenSearch exceptions, for example, `org.opensearch.index_not_found_exception`. |
-| Indices | The list of indexes in the request URL. |
-| HTTPRespCode | The response code from OpenSearch, for example, `200`. |
-| MemType | The memory type, for example, `totYoungGC`, `totFullGC`, `Survivor`, `PermGen`, `OldGen`, `Eden`, `NonHeap`, or `Heap`. |
-| DiskName | The name of the disk, for example, `sda1`. |
-| DestAddr | The destination address, for example, `010015AC`. |
-| Direction | The direction, for example, `in` or `out`. |
-| ThreadPoolType | The OpenSearch thread pools, for example, `index`, `search`, or `snapshot`. |
-| CBType | The circuit breaker type, for example, `accounting`, `fielddata`, `in_flight_requests`, `parent`, or `request`. |
-| ClusterManagerTaskInsertOrder| The order in which the task was inserted, for example, `3691`. |
-| ClusterManagerTaskPriority | The priority of the task, for example, `URGENT`. OpenSearch executes higher-priority tasks before lower-priority ones, regardless of `insert_order`. |
-| ClusterManagerTaskType | The task type, for example, `shard-started`, `create-index`, `delete-index`, `refresh-mapping`, `put-mapping`, `CleanupSnapshotRestoreState`, or `Update snapshot state`. |
-| ClusterManagerTaskMetadata | The metadata for the task (if any). |
-| CacheType | The cache type, for example, `Field_Data_Cache`, `Shard_Request_Cache`, or `Node_Query_Cache`. |
-
+| `ShardID` | The ID of the shard, for example, `1`. |
+| `IndexName` | The name of the index, for example, `my-index`. |
+| `Operation` | The type of operation, for example, `shardbulk`. |
+| `ShardRole` | The shard role, for example, `primary` or `replica`. |
+| `Exception` | OpenSearch exceptions, for example, `org.opensearch.index_not_found_exception`. |
+| `Indices` | The list of indexes in the request URL. |
+| `HTTPRespCode` | The OpenSearch response code, for example, `200`. |
+| `MemType` | The memory type, for example, `totYoungGC`, `totFullGC`, `Survivor`, `PermGen`, `OldGen`, `Eden`, `NonHeap`, or `Heap`. |
+| `DiskName` | The name of the disk, for example, `sda1`. |
+| `DestAddr` | The destination address, for example, `010015AC`. |
+| `Direction` | The direction, for example, `in` or `out`. |
+| `ThreadPoolType` | The OpenSearch thread pools, for example, `index`, `search`, or `snapshot`. |
+| `CBType` | The circuit breaker type, for example, `accounting`, `fielddata`, `in_flight_requests`, `parent`, or `request`. |
+| `ClusterManagerTaskInsertOrder`| The order in which the task was inserted, for example, `3691`. |
+| `ClusterManagerTaskPriority` | The priority of the task, for example, `URGENT`. OpenSearch executes higher-priority tasks before lower-priority ones, regardless of `insert_order`. |
+| `ClusterManagerTaskType` | The task type, for example, `shard-started`, `create-index`, `delete-index`, `refresh-mapping`, `put-mapping`, `CleanupSnapshotRestoreState`, or `Update snapshot state`. |
+| `ClusterManagerTaskMetadata` | The metadata for the task (if any). |
+| `CacheType` | The cache type, for example, `Field_Data_Cache`, `Shard_Request_Cache`, or `Node_Query_Cache`. |
+| `NodeID` | The ID of the node. |
+| `Searchbp_mode` | The search backpressure mode, for example, `monitor_only` (default), `enforced`, or `disabled`. |
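Dimensions such as `NodeID` are what a metrics query groups by. The following is a minimal sketch of building such a query for one of the search backpressure metrics, assuming the Performance Analyzer metrics endpoint at `/_plugins/_performanceanalyzer/metrics` on port 9600 with `metrics`, `agg`, `dim`, and `nodes` query parameters. The `build_metrics_url` helper is a hypothetical name, and the dimension spelling `searchbp_mode` follows the "Relevant dimensions" heading in this patch; the casing your cluster accepts should be verified.

```python
from urllib.parse import urlencode


def build_metrics_url(metrics, aggregations, dimensions,
                      host="localhost", port=9600, all_nodes=False):
    """Build a Performance Analyzer metrics URL (hypothetical helper).

    Assumes one aggregation per metric, matched by position.
    """
    if len(metrics) != len(aggregations):
        raise ValueError("provide exactly one aggregation per metric")
    params = {
        "metrics": ",".join(metrics),
        "agg": ",".join(aggregations),
        "dim": ",".join(dimensions),
    }
    if all_nodes:
        params["nodes"] = "all"
    # Keep commas unescaped so the list-style parameters stay readable.
    query = urlencode(params, safe=",")
    return f"http://{host}:{port}/_plugins/_performanceanalyzer/metrics?{query}"


url = build_metrics_url(
    metrics=["SearchBP_Shard_Stats_CancellationCount"],
    aggregations=["sum"],
    dimensions=["NodeID", "searchbp_mode"],
)
print(url)
```

The resulting URL can be passed to any HTTP client. The sketch only assembles the request and does not contact a cluster.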