Summary
I propose adding a mechanism for templated triggers that keeps all current possibilities (such as different trigger settings and trigger parameters, for instance the movingAverage time) without the need to create duplicates and keep them up to date.
Why (based on our scenarios)
I think it's pretty common for a group of triggers to follow a template, especially when metric paths in Graphite are standardized in some way. For example, the trigger 5xx errors per client for service '<service-name>' in '<environment-name>' is one of our real triggers. Name, target, description, warning/error values, notification schedule, tags: every field has a default value in such templated triggers, and the majority of teams are OK with the defaults. Currently there is no mechanism to describe templated triggers, so users have to create a great many duplicate triggers that differ only in some part of the metric path or in trigger parameters. In our case there is a special daemon with a pretty big configuration (which allows overriding templated parameters); it reads the configuration and creates new triggers, updates outdated ones, and deletes triggers that are no longer required in Moira. Unfortunately, because of the growing number of services, virtual and hardware hosts, and other things whose metrics have been standardized, we have reached the maximum number of triggers in Moira 😢
I took one standardized tool (network-related data) that provides 10 types of triggers. There are about 10000 (10 thousand!) triggers for this tool, and that is for the production environment alone! Each type has about 1000 duplicates that differ in only one parameter, the service name 😓 And the "production" trigger differs from the "staging" one for the same service in only one parameter too. So we have a templated trigger with only two parameters that now has ~2000 duplicates. I can't count how many triggers use the default settings and how many override something, but I guess a huge number of triggers use the defaults. I also can't count how many triggers someone is subscribed to and how many not, but I guess some non-zero percentage is not needed by anyone (for now or forever). Although useless, such triggers still take up space in Moira and push us toward Moira's limits.
Details
I think there should be an ability to create a templated trigger with the following things:
- target variables - parts of the target (environment, service, etc.);
- target parameters - target calculation parameters (such as the movingMax length, the type of consolidateBy, etc.);
- trigger parameters - anything except the target (description, warn/error levels, etc.). They may and should use target variables to provide more relevant data (such as links to Grafana dashboards or tags).
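To make these three kinds of fields concrete, here is a minimal Python sketch of how a single templated field might be expanded. It assumes the notation used in the syntax example in this issue: `$name` marks a target variable and `%name=default%` marks a target parameter together with its default value. The `render` function and the two regular expressions are hypothetical and are not part of Moira.

```python
import re
from typing import Dict, Optional

# Hypothetical placeholder syntax, taken from the example in this issue:
#   $name            -> target variable (must be supplied)
#   %name=default%   -> target parameter with a default value
VAR_RE = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")
PARAM_RE = re.compile(r"%([A-Za-z_][A-Za-z0-9_]*)=([^%]*)%")


def render(template: str,
           variables: Dict[str, str],
           params: Optional[Dict[str, str]] = None) -> str:
    """Expand target variables and target parameters in one trigger field."""
    params = params or {}

    def sub_var(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"missing target variable: {name}")
        return variables[name]

    def sub_param(match):
        name, default = match.group(1), match.group(2)
        # Use the override if one was given, otherwise the template default.
        return params.get(name, default)

    return PARAM_RE.sub(sub_param, VAR_RE.sub(sub_var, template))


target = ("movingMax(transformNull($environment.services.$service."
          "rpsPerResponseCodeAndClientIdentity.5*.*, 0), '%movingMax=2min%')")
values = {"environment": "Production", "service": "ExampleService"}

with_defaults = render(target, values)
with_override = render(target, values, {"movingMax": "5min"})
```

The same function could be applied to the title, the Grafana link, and the tags, since those fields only use `$name` variables.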
Example of templated trigger syntax
Title: 5xx errors per client for service '$service' in '$environment'
Target: aliasByNode(scale(movingMax(movingSum(consolidateBy(transformNull($environment.services.$service.rpsPerResponseCodeAndClientIdentity.5*.*, 0), '%consolidateBy=max%'), '%movingSum=1min%'), '%movingMax=2min%'), 10), 6, 7)
Description: Trigger alerts when a client gets '5xx' responses.
Grafana: https://grafana.example.com/dashboard/db/clients?var-env=$environment&var-topo=$service
Tags: Clients.Auto.5xx-per-client, Clients.Auto.$environment, Clients.Auto.$service
Warn/Error Values/Schedule/etc
Example of templated trigger data overriding
The user clicks 'Create inherited trigger' in the UI, and Moira shows the usual editor, where:
- target is not editable: if you change the trigger target, it's a new trigger 🙂;
- tags are append-only or read-only, because some automation may use the templated trigger's tags;
- a read-only list of available target variables is shown (to help the user understand what is templated);
- the user must define a value for at least one target variable;
- an editable list of available target parameters is shown;
- any other trigger fields (usually the description) are fully up to the user.
After saving, Moira creates a "virtual" trigger that acts like this:
- if trigger.%field% is defined, its value is used;
- if trigger.%field% is not defined, the parent's value is used.
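The fallback rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: triggers are modelled as plain dictionaries, the field names (`warn_value`, `description`, and so on) are hypothetical, tags are treated as append-only (the issue leaves open whether they are append-only or read-only), and the target always comes from the parent.

```python
from typing import Any, Dict


def resolve_trigger(child: Dict[str, Any], parent: Dict[str, Any]) -> Dict[str, Any]:
    """Merge an inherited ("virtual") trigger with its parent template."""
    resolved: Dict[str, Any] = {}
    for field in set(parent) | set(child):
        value = child.get(field)
        # A field defined on the child wins; otherwise fall back to the parent.
        resolved[field] = value if value is not None else parent.get(field)

    # Assumption: the target is never overridable by the child.
    resolved["target"] = parent["target"]

    # Assumption: tags are append-only, so parent tags are always kept.
    parent_tags = list(parent.get("tags", []))
    extra_tags = [t for t in child.get("tags", []) if t not in parent_tags]
    resolved["tags"] = parent_tags + extra_tags
    return resolved


parent = {
    "target": "$environment.services.$service.rpsPerResponseCodeAndClientIdentity.5*.*",
    "description": "Trigger alerts when a client gets '5xx' responses.",
    "warn_value": 10,
    "tags": ["Clients.Auto.5xx-per-client"],
}
child = {
    "target": "something.else",          # ignored: target is not editable
    "description": "Custom description",  # overrides the parent
    "tags": ["Team.Custom"],              # appended to the parent tags
}

resolved = resolve_trigger(child, parent)
```

Because only one level of inheritance is proposed, a single lookup in the parent is enough and no chain walking is needed.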
I think more than one level of inheritance is not required.
Example of templated trigger subscribing
The user subscribes as usual to tags like Clients.Auto.$environment, Clients.Auto.$service. Moira recognizes the templated tag syntax and asks the user to define values for the variables. For example, $environment = Production, $service = ExampleService.
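A rough Python sketch of that subscription step might look as follows. The helper names `tag_variables` and `resolve_tags` are hypothetical; the sketch only assumes that a `$name` inside a tag marks a variable the user must fill in.

```python
import re
from typing import Dict, List

VAR_RE = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def tag_variables(tags: List[str]) -> List[str]:
    """Return the variable names the user has to provide, in sorted order."""
    names = set()
    for tag in tags:
        names.update(VAR_RE.findall(tag))
    return sorted(names)


def resolve_tags(tags: List[str], values: Dict[str, str]) -> List[str]:
    """Substitute user-provided values into templated tags."""
    missing = [name for name in tag_variables(tags) if name not in values]
    if missing:
        raise ValueError(f"values required for: {', '.join(missing)}")
    return [VAR_RE.sub(lambda m: values[m.group(1)], tag) for tag in tags]


templated = ["Clients.Auto.$environment", "Clients.Auto.$service"]
needed = tag_variables(templated)
concrete = resolve_tags(
    templated, {"environment": "Production", "service": "ExampleService"}
)
```

After substitution the subscription refers to ordinary tags, so the rest of the subscription handling would not need to know about templates.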
Example of templated trigger alerting
When Moira gets new metrics, it should check whether they are used, as it does now. For templated triggers this can be done in 2 ways:
1. By comparing with all subscriptions (where we have the final Graphite metric path without any templating);
2. By replacing every $var-name with *.
I think the first is somewhat better, because there may be metrics that look like they match a template but are not needed by anyone.
Next we have to determine the current values of the target variables. Since we know the target variable names and values, we can easily determine the final trigger to use by substituting the target variables (taking the overridden trigger if it exists). Then we can easily determine all trigger fields (from target to tags). Any further actions are done as they are now, because we have a usual trigger without any templating.
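Both the second approach (replacing each variable with `*`) and the step of recovering variable values from an incoming metric can be sketched in Python. This is a simplified illustration, not Moira's matching code: the function names are hypothetical, only the `*` wildcard is handled (Graphite's `{a,b}` and `[...]` forms are ignored), and each variable is assumed to stand for exactly one dot-separated path node.

```python
import re
from typing import Dict, Optional

TOKEN_RE = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)|\*")


def to_glob(path_template: str) -> str:
    """Approach 2: replace every $variable with '*'."""
    return re.sub(r"\$[A-Za-z_][A-Za-z0-9_]*", "*", path_template)


def to_regex(path_template: str):
    """Build a regex whose named groups capture the variable values."""
    parts, pos, seen = [], 0, set()
    for m in TOKEN_RE.finditer(path_template):
        parts.append(re.escape(path_template[pos:m.start()]))
        name = m.group(1)
        if name is None:
            parts.append(r"[^.]*")            # plain '*' wildcard
        elif name in seen:
            parts.append(f"(?P={name})")      # repeated variable must match again
        else:
            seen.add(name)
            parts.append(f"(?P<{name}>[^.]+)")
        pos = m.end()
    parts.append(re.escape(path_template[pos:]))
    return re.compile("".join(parts))


def extract_variables(path_template: str, metric: str) -> Optional[Dict[str, str]]:
    """Return variable values for a concrete metric, or None if it does not match."""
    m = to_regex(path_template).fullmatch(metric)
    return m.groupdict() if m else None


template = "$environment.services.$service.rpsPerResponseCodeAndClientIdentity.5*.*"
glob = to_glob(template)
found = extract_variables(
    template,
    "Production.services.ExampleService.rpsPerResponseCodeAndClientIdentity.500.mobile-app",
)
not_found = extract_variables(
    template,
    "Production.hosts.web01.cpu.load",
)
```

The extracted values could then serve as a key for looking up an overriding inherited trigger, falling back to the template when none exists.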