The current load-watcher Prometheus pkg uses the `instance:node_cpu:ratio` metric to calculate node utilization.

However, while this value was still below 60%, I found that another metric, `instance:node_cpu_utilisation:rate1m`, was much larger, around 90%. Apparently Prometheus applies some smoothing to these recorded metrics, and the one we use may already be smoothed over a large time window, possibly larger than 1m; my guess is 5m.

We are not sure which Prometheus metric is consistent with the value obtained directly from the metrics server, so more testing is needed.
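To see the discrepancy directly, one option is to query both metrics through the Prometheus HTTP API and print them side by side. This is only a testing sketch, not load-watcher code; the Prometheus address `http://localhost:9090` is an assumption, and it uses the official `prometheus/client_golang` API client.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/prometheus/client_golang/api"
	v1 "github.com/prometheus/client_golang/api/prometheus/v1"
)

func main() {
	// Prometheus address is an assumption; point it at your own endpoint.
	client, err := api.NewClient(api.Config{Address: "http://localhost:9090"})
	if err != nil {
		log.Fatalf("creating Prometheus client: %v", err)
	}
	promAPI := v1.NewAPI(client)

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// The metric load-watcher currently uses, plus the 1m variant it was compared against.
	for _, query := range []string{
		"instance:node_cpu:ratio",
		"instance:node_cpu_utilisation:rate1m",
	} {
		result, warnings, err := promAPI.Query(ctx, query, time.Now())
		if err != nil {
			log.Fatalf("querying %q: %v", query, err)
		}
		if len(warnings) > 0 {
			log.Printf("warnings for %q: %v", query, warnings)
		}
		fmt.Printf("%s:\n%v\n\n", query, result)
	}
}
```

Running this on a node under bursty load should show whether the 5m-smoothed ratio lags the 1m rate as described above.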
The `instance:node_cpu:ratio` metric's time window is 5m. Its recording rule is:

sum(rate(node_cpu_seconds_total{mode!="idle",mode!="iowait",mode!="steal"}[5m])) WITHOUT (cpu, mode) / ON(instance) GROUP_LEFT() count(sum(node_cpu_seconds_total) BY (instance, cpu)) BY (instance)
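Given that `instance:node_cpu:ratio` is averaged over 5m, the remaining open question from the issue is how it lines up with what the metrics server reports. A rough way to pull that number for comparison is sketched below; it is not load-watcher code, and it assumes a kubeconfig at the default `~/.kube/config` location, the metrics-server addon installed, and the `k8s.io/metrics` client.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"path/filepath"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
	"k8s.io/client-go/util/homedir"
	metricsclient "k8s.io/metrics/pkg/client/clientset/versioned"
)

func main() {
	// Kubeconfig path is an assumption; adjust for your environment.
	kubeconfig := filepath.Join(homedir.HomeDir(), ".kube", "config")
	config, err := clientcmd.BuildConfigFromFlags("", kubeconfig)
	if err != nil {
		log.Fatalf("loading kubeconfig: %v", err)
	}

	coreClient, err := kubernetes.NewForConfig(config)
	if err != nil {
		log.Fatalf("creating core client: %v", err)
	}
	metricsClient, err := metricsclient.NewForConfig(config)
	if err != nil {
		log.Fatalf("creating metrics client: %v", err)
	}

	ctx := context.Background()

	// Instantaneous node CPU usage as reported by metrics-server.
	nodeMetrics, err := metricsClient.MetricsV1beta1().NodeMetricses().List(ctx, metav1.ListOptions{})
	if err != nil {
		log.Fatalf("listing node metrics: %v", err)
	}
	// Node capacity, needed to turn usage into a utilization ratio.
	nodes, err := coreClient.CoreV1().Nodes().List(ctx, metav1.ListOptions{})
	if err != nil {
		log.Fatalf("listing nodes: %v", err)
	}
	capacityMilli := map[string]int64{}
	for _, n := range nodes.Items {
		capacityMilli[n.Name] = n.Status.Capacity.Cpu().MilliValue()
	}

	for _, nm := range nodeMetrics.Items {
		usage := nm.Usage.Cpu().MilliValue()
		if c, ok := capacityMilli[nm.Name]; ok && c > 0 {
			fmt.Printf("%s: %.2f%% CPU (metrics-server)\n", nm.Name, 100*float64(usage)/float64(c))
		}
	}
}
```

Comparing this percentage against both Prometheus metrics over a few sampling intervals should make it clearer which recording rule tracks the metrics-server value more closely.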