When using clustering, exporters may not work correctly due to instance label #1009

Open · Tracked by #784
thampiotr opened this issue Jun 10, 2024 · 4 comments
Labels: bug, needs-attention

Comments

@thampiotr (Contributor) commented Jun 10, 2024

What's wrong?

Most embedded Prometheus exporters set the instance label to the hostname where Alloy runs.

This breaks, in a subtle but significant way, the fundamental clustering assumption that all instances have the same configuration. The exporters implicitly inject the local hostname as the instance label, and since the instances typically run on different hosts, their target labels differ. As a result, some metrics are not scraped at all, while others are scraped multiple times under different instance labels, producing unnecessary duplicate series.

Steps to reproduce

  1. Run any exporter with clustering enabled in a cluster of 2+ instances, each running on a different host. Set up scraping with clustering and a remote write to a metrics DB (a minimal configuration sketch follows this list).
  2. Observe that some targets are not scraped at all, while others are scraped multiple times with different instance labels.
  3. Observe in the UI that the instance label on the exporters' targets differs between instances, indicating different series.
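
As a rough sketch of such a setup (assuming the node exporter; the remote_write endpoint is a placeholder), something like the following, run on two or more hosts, reproduces the problem:

prometheus.exporter.unix "node" { }

prometheus.scrape "node" {
  // Each Alloy instance sees the exporter target with its own hostname as the
  // instance label, so the cluster does not treat the targets as identical.
  targets    = prometheus.exporter.unix.node.targets
  forward_to = [prometheus.remote_write.default.receiver]
  clustering {
    enabled = true
  }
}

prometheus.remote_write "default" {
  endpoint {
    // Placeholder endpoint for the metrics DB.
    url = "http://mimir:9009/api/v1/push"
  }
}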

The issue was originally discussed in this PR, but we decided to move the conversation here for better tracking and to provide a place to refer to for workarounds.

@thampiotr added the bug label Jun 10, 2024
@thampiotr (Contributor, Author) commented:

There is a workaround for now: set the instance label to a common value for all instances in the cluster, using a discovery.relabel component. For example, this component sets it to "alloy-cluster":

discovery.relabel "replace_instance" {
  targets = discovery.file.targets.targets
  rule {
    action        = "replace"
    source_labels = ["instance"]
    target_label  = "instance"
    replacement   = "alloy-cluster"
  }
}

You'd add the above component between your exporters and your prometheus.scrape component.
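
For illustration, a wired-up pipeline might look like the sketch below (the prometheus.exporter.unix and prometheus.remote_write components are assumptions here; substitute your own exporters and downstream components):

prometheus.exporter.unix "node" { }

// Rewrite the per-host instance label to a value shared by the whole cluster,
// so that every Alloy instance sees an identical set of targets.
discovery.relabel "replace_instance" {
  targets = prometheus.exporter.unix.node.targets
  rule {
    action        = "replace"
    source_labels = ["instance"]
    target_label  = "instance"
    replacement   = "alloy-cluster"
  }
}

prometheus.scrape "node" {
  targets    = discovery.relabel.replace_instance.output
  forward_to = [prometheus.remote_write.default.receiver]
  clustering {
    enabled = true
  }
}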

A longer-term fix could also be achieved via #399. Regardless, we should have good documentation to ensure users don't fall into this pit.


This issue has not had any activity in the past 30 days, so the needs-attention label has been added to it.
If the opened issue is a bug, check to see if a newer release fixed your issue. If it is no longer relevant, please feel free to close this issue.
The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your issue will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity.
Thank you for your contributions!

@thampiotr self-assigned this Aug 1, 2024
@rgarrigue commented:

This affected our 30+ blackbox probes across the ~7 Alloy instances we deploy via Helm: random targets were missed, triggering DatasourceNoData in our alerting. The workaround fixed it.

@st-akorotkov commented:

TBH, this proposed rule is not really a workaround. Using it breaks multiple dashboards and alerts, since we can no longer distinguish the nodes running node-exporter.
