Streaming aggregation config files as CRDs in VM operator for K8s #788

tanisdlj · 2023-10-20T13:40:05Z

Is your feature request related to a problem? Please describe

Having CRDs to scrape different services is great: each microservice repository can have a folder, for instance /monitoring, where developers keep their vmservicescrapes CRDs. These are deployed together with the service in K8s and read by vmagent, which then starts scraping each new service.

The problem is that if a developer wants to reduce the cardinality of the scraped metrics with a Streaming Aggregation config, the only way to do it right now is to change the vmagent deployment and put all the streaming aggregation configs in a single file passed with an arg, like

-remoteWrite.streamAggr.config=/etc/vm/stream-aggr-configs/stream-aggr-configs.config

This stops developers from controlling their own metrics and leaves the team in charge of the vmagent deployment to maintain and "gatekeep" the configuration for every service in the company in a single (and possibly massive) file.

Describe the solution you'd like

The same way we have CRDs for vmnodescrapes, vmpodscrapes and vmservicescrapes, we should be able to deploy vmstreamaggrconfigs.

vmagent can then read the CRDs and apply the configs the same way it does with the scrape configs.

Describe alternatives you've considered

We created a separate repository where developers can drop their stream aggregation config files. The CI then merges all the files into a single one that is deployed as a ConfigMap in Kubernetes and mounted in a directory with a specific filename, which vmagent reads by default with -remoteWrite.streamAggr.config=/etc/vm/stream-aggr-configs/stream-aggr-configs.config.

This doesn't really solve the issue, but it at least mitigates it, so developers don't have to make changes in the VM K8s operator code.

We still have to gatekeep the code and will have a massive file sooner rather than later.
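The CI merge step described above could look roughly like the sketch below. It is not our actual pipeline: the function name, the *.yaml pattern and the "# source:" comments are hypothetical, and it assumes every dropped file is a top-level YAML list in block style (items starting with "- " at column 0), so plain concatenation yields one valid list.

```python
from pathlib import Path


def merge_stream_aggr_configs(src_dir: str, pattern: str = "*.yaml") -> str:
    """Concatenate per-service files into one stream aggregation config.

    Assumption: each file is a top-level YAML list in block style, so
    joining the file bodies produces a single valid list.
    """
    parts = []
    # Sort for a deterministic output, so the ConfigMap only changes
    # when a source file changes.
    for path in sorted(Path(src_dir).glob(pattern)):
        body = path.read_text(encoding="utf-8").strip("\n")
        if not body.strip():
            continue  # skip empty files
        # Record the origin of each chunk as a YAML comment.
        parts.append(f"# source: {path.name}\n{body}")
    return "\n".join(parts) + "\n"
```

The returned string would then be written to the file that the ConfigMap mounts. Nothing here validates the rules themselves, so a broken file from one team still breaks the merged config for everyone.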

Additional information

An "easy" solution would be to apply an approach similar to the one VM already has with servicescrapes.

Haleygo · 2023-10-24T07:36:35Z

Thanks for the idea!
remoteWrite.streamAggr.config is an array, with one entry per remoteWrite.url, so vmstreamaggrconfigs would also need a field like targets that tells the operator which remote write endpoint defined in vmagent to generate the streamaggr config for. So there can be complicated cases, like when different remote write endpoints want different configs:

cc @f41gh7 @hagen1778

hagen1778 · 2023-10-24T07:58:19Z

cc @Amper

tanisdlj · 2023-10-24T15:07:36Z

An idea: you could define the targets with some label or annotation. If there is a single target, then every annotation applies to it. If there are multiple targets, then something like target: ep1 could be a solution, and a config with no annotation or label is ignored.
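A minimal sketch of that selection rule, just to make the idea concrete. The function name and the plain string endpoint names are made up for illustration, and this is not how the operator works today:

```python
from typing import List, Optional


def resolve_targets(annotation: Optional[str], endpoints: List[str]) -> List[str]:
    """Pick the remote write endpoints an aggregation config applies to.

    annotation: value of a hypothetical "target" annotation, or None.
    endpoints:  names of the remote write endpoints defined in vmagent.
    """
    if len(endpoints) == 1:
        # Single target: the config always applies to it.
        return list(endpoints)
    if annotation in endpoints:
        # Multiple targets: the annotation selects one of them.
        return [annotation]
    # Multiple targets and no (or unknown) annotation: config is ignored.
    return []


print(resolve_targets(None, ["ep1"]))          # ['ep1']
print(resolve_targets("ep1", ["ep1", "ep2"]))  # ['ep1']
print(resolve_targets(None, ["ep1", "ep2"]))   # []
```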

yuvalavidor · 2024-01-30T15:29:38Z

I agree with this approach. When using the default k8s-stack, adding streaming aggregation requires completely replacing the array element of the remote write. For those who use a values.yaml override, that creates a need to hard-code the URL in all environments.

For the default one we could maybe create an "additionalStreaminAggregations" value or something of the sort.
Maybe this comment is more suitable for the helm-chart repo...

f41gh7 · 2024-03-06T09:41:10Z

I don't think it's possible to delegate streamAggr configuration to the scrape definition, because that configuration is not defined and applied per scrape job at vmagent.

It's also unclear how to merge conflicting configurations.

For instance,

VMServiceScrape-1 has the following config:

job: scrape-job-name-1
Aggr:
- match: 'http_requests_total'
  interval: 30s
  without: [path, user]
  outputs: [total]

VMServiceScrape-2 has the following config:

job: scrape-job-name-2
Aggr:
- match: 'http_requests_total'
  interval: 7s
  without: [path, user, host]
  outputs: [total]

Even if match were changed from match: 'http_requests_total' to match: 'http_requests_total{job="scrape-job-name-1"}', it would still produce a metric based on http_requests_total with a different set of labels and interval.

It doesn't look like a good approach to me.

Also, it requires users to know about the other streaming aggregation configurations.
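To illustrate the merge problem, here is a rough sketch of a check that flags rules from different owners that aggregate the same match with different settings. The dict layout and the function name are hypothetical, and none of this exists in the operator:

```python
from collections import defaultdict
from typing import Dict, List


def find_conflicts(rules_by_owner: Dict[str, List[dict]]) -> Dict[str, dict]:
    """Return every match that different owners aggregate differently."""
    seen = defaultdict(dict)  # match -> {owner: settings}
    for owner, rules in rules_by_owner.items():
        for rule in rules:
            # Normalise list fields so label order does not matter.
            settings = (
                rule["interval"],
                tuple(sorted(rule.get("without", []))),
                tuple(sorted(rule.get("outputs", []))),
            )
            # Sketch only: a second rule with the same match from the
            # same owner overwrites the first one.
            seen[rule["match"]][owner] = settings
    return {
        match: owners
        for match, owners in seen.items()
        if len(set(owners.values())) > 1
    }


conflicts = find_conflicts({
    "VMServiceScrape-1": [{"match": "http_requests_total", "interval": "30s",
                           "without": ["path", "user"], "outputs": ["total"]}],
    "VMServiceScrape-2": [{"match": "http_requests_total", "interval": "7s",
                           "without": ["path", "user", "host"], "outputs": ["total"]}],
})
print(sorted(conflicts))  # ['http_requests_total']
```

A check like this can only report the conflict. It cannot decide which interval or label set should win, which is exactly the part that is unclear.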

tanisdlj added the enhancement label on Oct 20, 2023

Amper transferred this issue from VictoriaMetrics/VictoriaMetrics on Oct 23, 2023
