
Add track_timestamps_staleness to prometheus.scrape #5921

Closed · ptodev opened this issue Dec 6, 2023 · 8 comments · Fixed by #5972
Labels
enhancement (New feature or request) · frozen-due-to-age (Locked due to a period of inactivity. Please open new issues or PRs if more discussion is needed.) · needs-attention (An issue or PR has been sitting around and needs attention.)

Comments

ptodev (Contributor) commented Dec 6, 2023

Request

The track_timestamps_staleness feature was added to Prometheus recently. To pick it up, we would need to upgrade to Prometheus v2.48 or above.

track_timestamps_staleness is false by default in Prometheus, because changing the behaviour is considered a breaking change. We should consider whether it could be set to true in the Agent by default. For example, does setting it to false have any benefits? Are there any implications for alerts which rely on these metrics, and would changing the default be a serious breaking change for users?

Also, we need to go into more detail in the docs about what metrics with explicit timestamps are, and what staleness markers are, so that users can make an informed decision on how to configure track_timestamps_staleness. This is what the Prometheus docs say about track_timestamps_staleness:

# track_timestamps_staleness controls whether Prometheus tracks staleness of
# the metrics that have an explicit timestamp present in scraped data.
#
# If track_timestamps_staleness is set to "true", a staleness marker will be
# inserted in the TSDB when a metric is no longer present or the target
# is down.
[ track_timestamps_staleness: <boolean> | default = false ]

I personally didn't even know there could be metrics with an explicit timestamp present in scraped data. It is also not clear to me what the implications of adding a staleness marker are.
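For illustration, a scraped sample with an explicit timestamp looks like this in the Prometheus text exposition format (the metric name, label, and values below are made up; the trailing integer is a Unix timestamp in milliseconds):

# HELP container_cpu_usage_seconds_total Cumulative CPU time consumed.
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{container="app"} 123.45 1701864000000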

Use case

Setting track_timestamps_staleness to true could fix a bug where container metrics linger for 5 minutes after a container dies.
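The 5 minutes presumably comes from Prometheus' query lookback delta: without a staleness marker, an instant query keeps returning the last ingested sample for up to that long. The relevant server flag, shown here with its default value:

# Without a staleness marker, instant queries return the most recent sample
# for up to the lookback delta, which defaults to 5 minutes:
prometheus --query.lookback-delta=5m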

ptodev added the enhancement (New feature or request) label on Dec 6, 2023
ptodev (Contributor, Author) commented Dec 6, 2023

Note that there is also an open PR for the Agent to add a Prometheus feature which is not available in Prometheus 2.48. Maybe we could upgrade the Agent straight to Prometheus 2.49 so that we can resolve both #5921 and #5905?

bboreham (Contributor) commented

cAdvisor puts a timestamp on its metrics; it does this because it collects the data asynchronously, so it doesn't want the Prometheus scrape timestamp.

Adding a staleness marker just makes the series stop. This is what happens for every other metric that uses Prometheus' timestamp, when it goes away.

ptodev (Contributor, Author) commented Dec 19, 2023

> Adding a staleness marker just makes the series stop. This is what happens for every other metric that uses Prometheus' timestamp, when it goes away.

Yes, but unlike non-timestamped metrics, such a target could expose a sample with an old timestamp again a scrape or two later. Wouldn't this be a problem if we had already inserted a staleness marker by then?

bboreham (Contributor) commented Jan 5, 2024

No, there is no problem going from {1, 2, stale} to {1, 2, 3, stale}. Or to {1, 2, stale, 3}.

But there is a knob to turn it off, in case you run into trouble. My contention is that 99% of the world wants it on, even though Prometheus v2 will default it off for compatibility reasons.

ptodev (Contributor, Author) commented Jan 5, 2024

#5972 was merged, but we will need to raise a new PR for this, since #5972 intentionally did not expose a track_timestamps_staleness config entry.

ptodev reopened this on Jan 5, 2024
github-actions bot commented Feb 6, 2024

This issue has not had any activity in the past 30 days, so the needs-attention label has been added to it.
If the opened issue is a bug, check to see if a newer release fixed your issue. If it is no longer relevant, please feel free to close this issue.
The needs-attention label signals to maintainers that something has fallen through the cracks. No action is needed by you; your issue will be kept open and you do not have to respond to this comment. The label will be removed the next time this job runs if there is new activity.
Thank you for your contributions!

github-actions bot added the needs-attention label on Feb 6, 2024
tpaschalis (Member) commented

cc @ptodev, did we actually end up concluding the discussion on #5972?

Do we plan to wire in this new field, and what should the default be after all?

ptodev (Contributor, Author) commented Feb 21, 2024

Yes, @bboreham added it in #6317. So far on the main branch, track_timestamps_staleness is set to false by default. I'll close this issue; if the need arises we can reopen it or open another one.
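For reference, enabling it in the Agent's Flow mode would look roughly like the sketch below (assuming the component argument keeps the Prometheus name; the target address and forward_to destination are placeholders):

prometheus.scrape "example" {
  targets    = [{"__address__" = "cadvisor:8080"}]
  forward_to = [prometheus.remote_write.default.receiver]

  // Opt in to staleness markers for metrics with explicit timestamps.
  // The default is false, matching Prometheus.
  track_timestamps_staleness = true
}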

ptodev closed this as completed on Feb 21, 2024
github-actions bot added the frozen-due-to-age label on Mar 23, 2024
github-actions bot locked as resolved and limited conversation to collaborators on Mar 23, 2024