Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Das client using chunks metric #2599

Closed
wants to merge 5 commits into from

Conversation

Tristan-Wilson
Copy link
Member

This PR adds a metric on the batch poster side to indicate which DAS backends are using the chunked store method and which are using the legacy method.

The metrics that are added are:

arb_das_rpcclient_<sanitized backend hostname>_legacystore_total
arb_das_rpcclient_<sanitized backend hostname>_chunkedstore_total

@cla-bot cla-bot bot added the s Automatically added by the CLA bot if the creator of a PR is registered as having signed the CLA. label Aug 21, 2024
@Tristan-Wilson
Copy link
Member Author

Fixes NIT-2742

gligneul
gligneul previously approved these changes Aug 21, 2024
Copy link
Contributor

@gligneul gligneul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Does this PR replace #2549?

Copy link
Contributor

@ganeshvanahalli ganeshvanahalli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Tristan-Wilson Tristan-Wilson changed the base branch from master to das-client-metrics August 22, 2024 20:39
@Tristan-Wilson
Copy link
Member Author

LGTM! Does this PR replace #2549?

Oops, I had meant for this PR to be based on that PR, changed the base now. Thanks.

Base automatically changed from das-client-metrics to master September 30, 2024 22:26
@tsahee tsahee dismissed stale reviews from ganeshvanahalli and gligneul September 30, 2024 22:26

The base branch was changed.

Copy link
Contributor

@tsahee tsahee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't review yet. Marking as approved because it has previous approvals that were dismissed when it's parent changed. That way it should come up in my searches for approved PRs

@Tristan-Wilson
Copy link
Member Author

In promql you can make a query to discover and show the rates for the metrics with different names like this:

rate(
  label_replace(
    {__name__=~"arb_das_rpcclient_.*_legacystore_total", chain="nova"},
    "metric_name", "$0", "__name__", "(.*)"
  )[${__rate_interval}:15s]
)
  • label_replace(...): Adds a metric_name label to each time series, which is required by rate which needs the metrics to have differing label sets (__name__ does not count as a label in this case).
  • [${__rate_interval}:15s]: The subquery notation that creates a range vector from the result of label_replace(). rate(...) requires a range vector as its input, which label_replace does not provide by default.

Copy link
Collaborator

@PlasmaPower PlasmaPower left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One comment but generally LGTM

url: target,
signer: signer,
chunkSize: uint64(chunkSize),
metricName: metricsutil.CanonicalizeMetricName(url.Hostname()),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like CanonicalizeMetricName doesn't filter out periods, which don't seem to be an allowed Prometheus metric name but are of course common in URLs.

Copy link
Member Author

@Tristan-Wilson Tristan-Wilson Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it does replace them. The regex is picking characters that don't match the range, which doesn't include ., and replacing them with _.

// Prometheus metric names must contain only chars [a-zA-Z0-9:_]
func CanonicalizeMetricName(metric string) string {
	invalidPromCharRegex := regexp.MustCompile(`[^a-zA-Z0-9:_]+`)
	return invalidPromCharRegex.ReplaceAllString(metric, "_")

}

If you look at the Nova dashboard, first panel, you can see the dots in domains have been replaced with _.

Copy link
Member

@joshuacolvin0 joshuacolvin0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prometheus metrics are really for things that should be tracked long term, but legacy/chunked is just a transition between new and old, so better to just emit an info log when using legacy (no log when doing chunked)

@Tristan-Wilson
Copy link
Member Author

Will redo as a new PR to just log when we are using the legacy method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
design-approved s Automatically added by the CLA bot if the creator of a PR is registered as having signed the CLA.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants