opentelemetry python sdk does not work well under fork #4215

Open
yurneroma opened this issue Oct 9, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@yurneroma

Describe your environment

OS: Ubuntu 22.04
Python version: 3.12
SDK version: 1.26.0
API version: 1.26.0

What happened?

I wrote the code below:

import os
from opentelemetry import  trace

from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
    OTLPSpanExporter,
)
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, SimpleSpanProcessor
from functools import wraps
from multiprocessing import Process
import multiprocessing as mp

def init_tracer():
    resource = Resource.create(
        attributes={
            "service.name": "api-service",
            # If workers are not distinguished within attributes, traces and
            # metrics exported from each worker will be indistinguishable. While
            # not necessarily an issue for traces, it is confusing for almost
            # all metric types. A built-in way to identify a worker is by PID
            # but this may lead to high label cardinality. An alternative
            # workaround and additional discussion are available here:
            # https://github.com/benoitc/gunicorn/issues/1352
            "worker": os.getpid(),
        }
    )

    trace.set_tracer_provider(TracerProvider(resource=resource))
    # This uses insecure connection for the purpose of example. Please see the
    # OTLP Exporter documentation for other options.
    span_processor = BatchSpanProcessor(
            OTLPSpanExporter(endpoint="http://tempo.mycompany.cn:4318/v1/traces")
    )

    trace.get_tracer_provider().add_span_processor(span_processor)


# def post_fork(func):
#     @wraps(func)
#     def wrapper(*args, **kwargs):
#         init_tracer()
#         res = func(*args, **kwargs)
#         return res
#     return wrapper

#@post_fork
import time
def worker_func(name):
    with trace.get_tracer(__name__).start_as_current_span('multi-span') as span:
        span.set_attribute("pid", os.getpid())
        print(f"worker_func {name} running, span context : {trace.get_current_span().get_span_context()}")




if __name__ == '__main__':
    init_tracer()
    print('----------------------')

    mp.set_start_method('fork')
    p = Process(target=worker_func, args=('bob',))
    p.start()
    print("main process will wait child process")


 

I expect the trace generated in the child process to be exported to Tempo, but it is not.

Steps to Reproduce

Run the script shown above under "What happened?". The child process is started with the fork start method.

Expected Result

The trace generated in the child process should be exported to Tempo.

Could you also provide a function to reinitialize the TracerProvider object after forking a process, so that the child can use it as a completely new object with all of its state correct?
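
Until the SDK offers such a hook, one possible interim workaround is to flush explicitly before the worker function returns, in the spirit of the commented-out post_fork decorator above. This is a minimal sketch, not an SDK feature: flush_before_exit and get_provider are hypothetical names, and it assumes the provider object exposes a force_flush() method, as the SDK TracerProvider does.

```python
from functools import wraps


def flush_before_exit(get_provider):
    """Hypothetical decorator: flush the tracer provider after func runs.

    get_provider is any zero-argument callable returning the provider,
    for example opentelemetry.trace.get_tracer_provider (assumption: the
    returned object has a force_flush() method, as the SDK provider does).
    """
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            finally:
                # Flush even if func raised, so queued spans are exported
                # before the child process exits.
                flush = getattr(get_provider(), "force_flush", None)
                if callable(flush):
                    flush()
        return wrapper
    return decorator
```

With the script above it would be applied as @flush_before_exit(trace.get_tracer_provider) on worker_func. The getattr guard is there because the API-level default provider has no force_flush(); in that case the wrapper does nothing extra.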

Actual Result

The readable span data is never exported by the BatchSpanProcessor.worker() function.

Additional context

  • I initialize the TracerProvider in the main process.
  • I then fork a process and execute some logic in it.
  • In my worker_func, I generate a span.
  • I added some logging to the opentelemetry-sdk library.
  • It seems that when the child process exits, the TracerProvider object in the child never calls shutdown(), so SynchronousMultiSpanProcessor.shutdown() is never called and nothing waits for BatchSpanProcessor.worker() to finish. The span is put on the queue, but while the worker thread is still waiting on the condition (with a timeout), the child process exits and the worker thread is killed.
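
The sequence in the last bullet can be illustrated with a toy model. This is a stdlib-only sketch, not the SDK's BatchSpanProcessor: TinyBatchProcessor, its export callback and schedule_delay_s are made-up names for illustration. It shows that a finished item only sits in the queue until the worker's timed wait expires, unless something calls force_flush() or shutdown() first.

```python
import threading
from collections import deque


class TinyBatchProcessor:
    """Toy stand-in for a batching span processor (not the SDK class)."""

    def __init__(self, export, schedule_delay_s=5.0):
        self._export = export              # callable taking a list of items
        self._delay = schedule_delay_s
        self._queue = deque()
        self._cond = threading.Condition()
        self._done = False
        # Daemon thread: it is simply killed if the process exits first.
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def on_end(self, item):
        # Only enqueue; exporting happens later.
        with self._cond:
            self._queue.append(item)

    def _drain(self):
        with self._cond:
            batch = list(self._queue)
            self._queue.clear()
        if batch:
            self._export(batch)

    def _run(self):
        while True:
            with self._cond:
                if self._done:
                    break
                # Timed wait: queued items stay unexported during this time.
                self._cond.wait(self._delay)
                if self._done:
                    break
            self._drain()

    def force_flush(self):
        # Export whatever is queued right now, on the calling thread.
        self._drain()

    def shutdown(self):
        # Wake the worker, wait for it to stop, then export the remainder.
        with self._cond:
            self._done = True
            self._cond.notify_all()
        self._worker.join()
        self._drain()
```

If the process exits between on_end() and the end of the timed wait without calling force_flush() or shutdown(), the queued item is lost in this model, which matches what I observe in the child process.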


Would you like to implement a fix?

Yes
