-
Notifications
You must be signed in to change notification settings - Fork 162
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add time and memory related db operation metrics #1203
base: main
Are you sure you want to change the base?
Changes from all commits
ffbf5cb
b813696
de7e7ae
7432acd
b93a472
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
# Use this changelog template to create an entry for release notes. | ||
# | ||
# If your change doesn't affect end users you should instead start | ||
# your pull request title with [chore] or use the "Skip Changelog" label. | ||
|
||
# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' | ||
change_type: enhancement | ||
|
||
# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db) | ||
component: db | ||
|
||
# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`). | ||
note: Add database operation metrics related to memory and time spent | ||
|
||
# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists. | ||
# The values here must be integers. | ||
issues: [ 723 ] | ||
|
||
# (Optional) One or more lines of additional information to render under the primary note. | ||
# These lines will be padded with 2 spaces and then inserted directly into the document. | ||
# Use pipe (|) for multiline entries. | ||
subtext: |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -33,6 +33,154 @@ groups: | |
- ref: db.namespace | ||
requirement_level: | ||
conditionally_required: If available. | ||
- ref: thread.id | ||
requirement_level: | ||
conditionally_required: If available and stable for the lifetime of the operation. | ||
|
||
- id: metric.db.client.operation.wait_time | ||
type: metric | ||
metric_name: db.client.operation.wait_time | ||
brief: "Time spent waiting by database client operations, by category (if available)." | ||
instrument: histogram | ||
unit: "s" | ||
stability: experimental | ||
extends: attributes.db.client.minimal | ||
attributes: | ||
- ref: db.collection.name | ||
requirement_level: | ||
conditionally_required: > | ||
If readily available. The collection name MAY be parsed from the query text, | ||
in which case it SHOULD be the first collection name in the query. | ||
- ref: db.system | ||
# TODO: Not adding to the minimal because of https://github.com/open-telemetry/build-tools/issues/192 | ||
requirement_level: required | ||
- ref: network.peer.address | ||
brief: Peer address of the database node where the operation was performed. | ||
requirement_level: | ||
recommended: If applicable for this database system. | ||
note: > | ||
Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. | ||
Network peer address and port are useful when the application interacts with individual database nodes directly. | ||
|
||
If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. | ||
- ref: network.peer.port | ||
requirement_level: | ||
recommended: If and only if `network.peer.address` is set. | ||
- ref: db.namespace | ||
requirement_level: | ||
conditionally_required: If available. | ||
- ref: thread.id | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Nit: maybe say "if available and stable for the lifetime of the operation". IF using, e.g. async IO and the handler of the response is different than the calling thread - you don't want to create new timeseries because thread.id changes. Can we make that more explicit? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good call, done |
||
requirement_level: | ||
conditionally_required: If available and stable for the lifetime of the operation. | ||
|
||
- id: metric.db.client.operation.cpu_time | ||
type: metric | ||
metric_name: db.client.operation.cpu_time | ||
brief: "CPU time spent for database client operations." | ||
instrument: histogram | ||
unit: "s" | ||
stability: experimental | ||
extends: attributes.db.client.minimal | ||
attributes: | ||
- ref: db.collection.name | ||
requirement_level: | ||
conditionally_required: > | ||
If readily available. The collection name MAY be parsed from the query text, | ||
in which case it SHOULD be the first collection name in the query. | ||
- ref: db.system | ||
# TODO: Not adding to the minimal because of https://github.com/open-telemetry/build-tools/issues/192 | ||
requirement_level: required | ||
- ref: network.peer.address | ||
brief: Peer address of the database node where the operation was performed. | ||
requirement_level: | ||
recommended: If applicable for this database system. | ||
note: > | ||
Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. | ||
Network peer address and port are useful when the application interacts with individual database nodes directly. | ||
|
||
If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. | ||
- ref: network.peer.port | ||
requirement_level: | ||
recommended: If and only if `network.peer.address` is set. | ||
- ref: db.namespace | ||
requirement_level: | ||
conditionally_required: If available. | ||
- ref: thread.id | ||
requirement_level: | ||
conditionally_required: If available and stable for the lifetime of the operation. | ||
|
||
- id: metric.db.client.operation.pages_scanned | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. since not all databases use pages, I don't think it makes sense to have this on the "general" view of the database. If there are things that only make sense for a specific group, is worth having those split from here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I definitely get the reasoning there and don't necessarily disagree, but if the concept of "pages" (or another example) is shared by multiple databases, either now or in the future, would it make more sense to have a common metric name for them or multiple db-specific names? And could it be more confusing to have some metrics have common names while others would fall under db-specific namespaces? Happy to move these metrics elsewhere if these topics have already been discussed/answered though. |
||
type: metric | ||
metric_name: db.client.operation.pages_scanned | ||
brief: "Pages scanned for database client operations." | ||
instrument: histogram | ||
unit: "{pages}" | ||
stability: experimental | ||
extends: attributes.db.client.minimal | ||
attributes: | ||
- ref: db.collection.name | ||
requirement_level: | ||
conditionally_required: > | ||
If readily available. The collection name MAY be parsed from the query text, | ||
in which case it SHOULD be the first collection name in the query. | ||
- ref: db.system | ||
# TODO: Not adding to the minimal because of https://github.com/open-telemetry/build-tools/issues/192 | ||
requirement_level: required | ||
- ref: network.peer.address | ||
brief: Peer address of the database node where the operation was performed. | ||
requirement_level: | ||
recommended: If applicable for this database system. | ||
note: > | ||
Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. | ||
Network peer address and port are useful when the application interacts with individual database nodes directly. | ||
|
||
If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. | ||
- ref: network.peer.port | ||
requirement_level: | ||
recommended: If and only if `network.peer.address` is set. | ||
- ref: db.namespace | ||
requirement_level: | ||
conditionally_required: If available. | ||
- ref: thread.id | ||
requirement_level: | ||
conditionally_required: If available and stable for the lifetime of the operation. | ||
|
||
- id: metric.db.client.operation.used_memory | ||
type: metric | ||
metric_name: db.client.operation.used_memory | ||
brief: "Memory used for database client operations." | ||
instrument: histogram | ||
unit: "By" | ||
stability: experimental | ||
extends: attributes.db.client.minimal | ||
attributes: | ||
- ref: db.collection.name | ||
requirement_level: | ||
conditionally_required: > | ||
If readily available. The collection name MAY be parsed from the query text, | ||
in which case it SHOULD be the first collection name in the query. | ||
- ref: db.system | ||
# TODO: Not adding to the minimal because of https://github.com/open-telemetry/build-tools/issues/192 | ||
requirement_level: required | ||
- ref: network.peer.address | ||
brief: Peer address of the database node where the operation was performed. | ||
requirement_level: | ||
recommended: If applicable for this database system. | ||
note: > | ||
Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. | ||
Network peer address and port are useful when the application interacts with individual database nodes directly. | ||
|
||
If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. | ||
- ref: network.peer.port | ||
requirement_level: | ||
recommended: If and only if `network.peer.address` is set. | ||
- ref: db.namespace | ||
requirement_level: | ||
conditionally_required: If available. | ||
- ref: thread.id | ||
requirement_level: | ||
conditionally_required: If available and stable for the lifetime of the operation. | ||
|
||
- id: metric.db.client.connection.count | ||
type: metric | ||
metric_name: db.client.connection.count | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious about the addition of the thread.id on all of these. Do you have a use case in mind with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My colleague mentioned the rationale in his comment here
In short, for the case of single-threaded operations, the thread id could be useful for troubleshooting slow/blocked queries, and it might also be helpful to aggregate stats on a certain thread for diagnosing performance issues. But we're open to discussion on the attribute and dropping it from the spec if the consensus is that these use cases are too niche and/or not really practical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does thread id add to the cardinality in an uncontrolled way here?