Commit

Add a ticket-level alert for prep volume disk pressure
Our prep filesystems can see large, rapid fluctuations in
utilization, and they are often close to full without that
being a problem, which makes alerting difficult. Still, it
seems reasonable to receive a ticket-level alert when any prep
filesystem is more than 98% full, regardless of fstype. The alert
as written relies on prep filesystems having the word "prep"
in their mount point, which is almost always the case.
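
The condition described above can be sketched in Python to make the arithmetic and the mount point match concrete. This is an illustrative model only, not part of the commit: the function names and the example mount points are hypothetical, and Python's `re.fullmatch` stands in for Prometheus's `=~` matcher, which is fully anchored (a pattern must match the whole label value, so "contains prep" needs something like `.*prep.*`).

```python
import re

# Threshold from the commit message: alert when more than 98% full.
THRESHOLD = 0.98


def used_fraction(size_bytes: float, avail_bytes: float) -> float:
    """Return (size - avail) / size, the ratio the alert compares to 0.98."""
    return (size_bytes - avail_bytes) / size_bytes


def matches_mountpoint(pattern: str, mountpoint: str) -> bool:
    """Approximate a Prometheus =~ matcher, which anchors at both ends."""
    return re.fullmatch(pattern, mountpoint) is not None


def is_prep_under_pressure(
    mountpoint: str,
    size_bytes: float,
    avail_bytes: float,
    pattern: str = ".*prep.*",  # assumed pattern: "prep" anywhere in the path
) -> bool:
    """True when the mount point looks like a prep volume and is over threshold."""
    return (
        matches_mountpoint(pattern, mountpoint)
        and used_fraction(size_bytes, avail_bytes) > THRESHOLD
    )


if __name__ == "__main__":
    # Hypothetical mount points, used only for illustration.
    print(is_prep_under_pressure("/htprep/ingest", 1000, 10))
    print(is_prep_under_pressure("/data", 1000, 10))
    print(matches_mountpoint("prep", "/htprep/ingest"))
```

The sketch leaves out the one-minute averaging and the 30-minute `for:` hold, both of which Prometheus evaluates over time rather than on a single sample.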
skorner committed Sep 13, 2024
1 parent c318a16 commit 26bc2fc
Showing 1 changed file with 18 additions and 0 deletions.
18 changes: 18 additions & 0 deletions templates/profile/prometheus/rules.yml.erb
@@ -141,6 +141,24 @@ groups:
    for: 5m
    labels:
      severity: page
  - alert: PrepDiskPressure
    annotations:
      summary: 'Prep filesystem {{$labels.hostname}}:{{$labels.mountpoint}} is more than 98% full.'
    expr: >
      (
        (
          avg_over_time(
            node_filesystem_size_bytes{mountpoint=~".*prep.*"}[1m]
          ) - avg_over_time(
            node_filesystem_avail_bytes[1m]
          )
        ) / avg_over_time(
          node_filesystem_size_bytes[1m]
        )
      ) > 0.98
    for: 30m
    labels:
      severity: ticket
  - alert: HTDataDenIsFull
    annotations:
      summary: 'Filesystem {{$labels.hostname}}:{{$labels.mountpoint}} is full.'