Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OpenTelemetry OTLP setup for tracing, take 2 #697

Open
wants to merge 16 commits into
base: main
Choose a base branch
from

Conversation

mzabaluev
Copy link
Collaborator

@mzabaluev mzabaluev commented Oct 15, 2024

Resolves: #555 #556 #559

Summary

  • Categories: protocol-units, networks, util.

Integrate an OTLP exporter for tracing events matching the "movement_tracing" target.
The OTLP gRPC endpoint is configured with the MOVEMENT_OTLP environment variable.

This replaces the earlier "movement_timing" tracing layer, as analysis of OpenTelemetry data should be more versatile.

Changelog

  • Added OpenTelemetry OTLP exporter, with the endpoint URL configured with the MOVEMENT_OTLP environment variable.
  • Removed support for timing JSON logs configured with MOVEMENT_TIMING.

Testing

The telemetry overlay for process-compose runs the jaeger all-in-one Docker container as a local OpenTelemetry collector.

Unresolved issues

Add docker-compose setup for jaeger or similar.

Replace the "movement_timing" tracing target and the logging layer
it targeted with an optionally installed OpenTelemetry OTLP exporter.
The name of the tracing target matched to send OpenTelemetry events is
"movement_telemetry".
Add the telemetry overlay enabling OTLP telemetry
export in suzuka-full-node and m1-da-light-node..

In the telemetry overlay for suzuka-full-node, add
an OTLP collector start job running a docker container.
Provide telemetry API as separate from tracing rather than
a globally installed layer. Installing an OpenTelemetry layer into the
global tracing subscriber raises nasty reentrancy issues because
the OTLP exporter stack also uses tracing under the hood.
Install an OpenTelemetryLayer configured with the OTLP exporter.
The tracing spans and events to export are selected by target
"movement_telemetry".
The implementation of shutdown in the opentelemetry_sdk exporter
calls futures_executor::block_on, which does not play well with
the multithreaded Tokio runtime.
@mzabaluev mzabaluev force-pushed the mikhail/opentelemetry-through-tracing branch from 3a27069 to f713d71 Compare October 16, 2024 12:56
OpenTelemetry needs spans at the top level of
its log event model at least.
Emit telemetry events detailing the success or failure
In the transaction_ingress task of suzuka-full-node
and executor's transaction_pipe, add comments detailing
which metrics the telemetry events are contributing to.
At the points where transaction is dropped in the submit_transaction
method, emit telemetry events. These will help compute the
transaction failure rate.
util/tracing/src/tracing.rs Show resolved Hide resolved
environment:

processes:
otlp-collector:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't seem to be getting anything served by this and the e2e tests for the basic simulation are failing. Is this how I should be using this?:

just suzuka-full-node native build.setup.eth-local.celestia-local.test.telemetry --keep-project

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Screenshot 2024-10-18 at 10 49 29 AM

This is all I'm seeing from the collector process when I start with the above.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a feed overlay in case for some reason this is not in fact staying open with the --keep-project flag.

just suzuka-full-node native build.setup.eth-local.celestia-local.feed.telemetry --keep-project

@l-monninger
Copy link
Collaborator

@mzabaluev Last commit adds a test. It would be good to add to CI if you like it.

@mzabaluev
Copy link
Collaborator Author

@mzabaluev Last commit adds a test. It would be good to add to CI if you like it.

I have added the overlays to the local test job, as we need some tests to drive the exporting. Do you think it would be better to test separately?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Emit OpenTelepathy events for ingress transactions
2 participants