Streaming aggregation config files as CRDs in VM operator for K8s #788

Open
tanisdlj opened this issue Oct 20, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@tanisdlj
Contributor

Is your feature request related to a problem? Please describe

Having CRDs to scrape different services is great: each microservice repository can have a folder, for instance /monitoring, where developers keep their vmservicescrapes CRDs. These are deployed together with the service in K8s and read by vmagent, which then starts scraping each new service.

The problem is that, if a developer wants to reduce the cardinality of the scraped metrics with a Streaming Aggregation config, right now the only way to do it is to change the vmagent deployment so that all the streaming aggregation configs live in a single file passed with an arg, like

-remoteWrite.streamAggr.config=/etc/vm/stream-aggr-configs/stream-aggr-configs.config

This prevents developers from controlling their own metrics and leaves the team in charge of the vmagent deployment to maintain and "gatekeep" the configuration for every service in the company in a single (and possibly massive) file.

Describe the solution you'd like

The same way we have CRDs for vmnodescrapes, vmpodscrapes and vmservicescrapes, we should be able to deploy vmstreamaggrconfigs.

vmagent can then read the CRDs and apply the configs the same way it does with the scrape configs.

Describe alternatives you've considered

We created a separate repository where developers can drop their stream aggregation config files. The CI then merges all the files into a single one that is deployed as a ConfigMap in Kubernetes and mounted in a directory with a specific filename, which vmagent reads by default with -remoteWrite.streamAggr.config=/etc/vm/stream-aggr-configs/stream-aggr-configs.config.

This doesn't really solve the issue, but it at least mitigates it, so developers don't have to make changes in the VM K8s operator code.

We still have to gatekeep the code and will have a massive file sooner rather than later.
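
A minimal sketch of what such a CI merge step could look like. It assumes every team file is a top-level YAML list of aggregation rules (each file starts with `- `), so plain concatenation yields one valid list. The function name `merge_stream_aggr_files` and the `*.yaml` layout are hypothetical, not taken from any real pipeline.

```python
from pathlib import Path


def merge_stream_aggr_files(src_dir: str, out_file: str) -> int:
    """Concatenate per-team rule files into a single config file.

    Assumes every *.yaml file in src_dir holds a top-level YAML list of
    stream aggregation rules, so concatenating them gives one valid list.
    Returns the number of non-empty files merged.
    """
    parts = []
    # Sort so the merged file is stable between CI runs.
    for path in sorted(Path(src_dir).glob("*.yaml")):
        text = path.read_text().strip("\n")
        if not text.strip():
            continue  # skip empty files
        if not text.startswith("- "):
            raise ValueError(f"{path.name}: expected a top-level YAML list")
        # A comment header records which team file each rule came from.
        parts.append(f"# source: {path.name}\n{text}\n")
    Path(out_file).write_text("".join(parts))
    return len(parts)
```

The merged file is what would then be shipped in the ConfigMap. The sketch does nothing to detect rules from different teams that overlap, so the gatekeeping problem remains.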

Additional information

An "easy" solution would be to apply an approach similar to the one VM already has with servicescrapes.

@tanisdlj added the enhancement label on Oct 20, 2023
@Amper transferred this issue from VictoriaMetrics/VictoriaMetrics on Oct 23, 2023
@Haleygo
Contributor

Haleygo commented Oct 24, 2023

Thanks for the idea!
remoteWrite.streamAggr.config is an array flag set per remoteWrite.url, so vmstreamaggrconfigs would also need a field like targets that tells the operator which remote write endpoint defined in vmagent to generate the stream aggregation config for. So there can be complicated cases, like when different remote write endpoints want different configs:
(image attached in the original comment)
cc @f41gh7 @hagen1778

@hagen1778
Contributor

cc @Amper

@tanisdlj
Contributor Author

An idea: you could define the targets with a label or annotation. If there is a single target, then every config applies to it. If there are multiple targets, then something like target: ep1 could select one, and if no annotation or label is set then the config is ignored.
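
The selection rule described above can be sketched as follows. This is only an illustration of the idea, not operator code: the function `resolve_target`, the `target` annotation, and the choice to ignore an annotation that names an unknown endpoint are all assumptions.

```python
from typing import Optional


def resolve_target(endpoints: list[str], target: Optional[str]) -> Optional[str]:
    """Pick the remote write endpoint an aggregation config applies to.

    endpoints: names of the remote write endpoints defined on vmagent.
    target: value of the (hypothetical) `target` annotation, if any.
    Returns the endpoint name, or None when the config is ignored.
    """
    if len(endpoints) == 1:
        return endpoints[0]  # single endpoint: every config applies to it
    if target in endpoints:
        return target  # multiple endpoints: the annotation selects one
    return None  # no annotation, or an unknown name: config is ignored
```

For example, `resolve_target(["ep1", "ep2"], "ep1")` selects `ep1`, while the same call without an annotation returns `None`.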

@yuvalavidor

I agree with this approach. When using the default k8s-stack, in order to add streaming aggregation you need to completely replace the array element of the remote write. For those who use a values.yaml override, that creates a need to hard-code the URL in all environments.

For the default one, we could maybe create an "additionalStreaminAggregations" value or something of the sort.
Maybe this comment is more suitable for the helm-chart repo...

@f41gh7
Collaborator

f41gh7 commented Mar 6, 2024

I don't think it's possible to delegate streamAggr configuration to the scraping definition, because stream aggregation is not configured or applied per scrape job in vmagent.

It's also unclear how to merge conflicting configurations.

For instance,

VMServiceScrape-1 has the following config:

job: scrape-job-name-1
Aggr:
- match: 'http_requests_total'
  interval: 30s
  without: [path, user]
  outputs: [total]

VMServiceScrape-2 has the following config:

job: scrape-job-name-2
Aggr:
- match: 'http_requests_total'
  interval: 7s
  without: [path, user,host]
  outputs: [total]

Even if match were changed from match: 'http_requests_total' to match: 'http_requests_total{job="scrape-job-name-1"}', it would still produce metrics based on http_requests_total with a different set of labels and a different interval.

It doesn't look like a good approach to me.

Also, it requires users to know about the other streaming aggregation configurations.
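
For illustration, the match rewrite mentioned above (adding a job filter to each rule) could be sketched like this. The helper `scope_match_to_job` is hypothetical and handles only the simple selector forms `metric` and `metric{filters}`; it is not something vmagent or the operator provides.

```python
def scope_match_to_job(match: str, job: str) -> str:
    """Add a job="<job>" filter to a series selector used in `match`.

    Handles only the simple forms `metric` and `metric{filters}`.
    """
    match = match.strip()
    job_filter = f'job="{job}"'
    if match.endswith("}") and "{" in match:
        # Split `metric{filters}` into the name and its existing filters.
        name, _, filters = match[:-1].partition("{")
        filters = filters.strip().rstrip(",")
        joined = f"{filters},{job_filter}" if filters else job_filter
        return f"{name}{{{joined}}}"
    # Bare metric name: wrap the job filter in braces.
    return f"{match}{{{job_filter}}}"


# Prints: http_requests_total{job="scrape-job-name-1"}
print(scope_match_to_job("http_requests_total", "scrape-job-name-1"))
```

Scoping keeps each rule from matching the other job's series, but the two rules from the example would still aggregate http_requests_total with different intervals and label sets, which is the concern raised in this comment.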
