[WIP] Add Initial Support for Instrumenting OpenAI Python Library - Chat Completion Create #2759
Conversation
Added a first round of comments, thanks!
```toml
build-backend = "hatchling.build"

[project]
name = "opentelemetry-instrumentation-openai"
```
Unfortunately this name is already taken, and we have to sort out how to handle that.
This could be really tricky. Do we have flexibility regarding naming, or should we follow a certain convention?
The convention for naming would suggest this name unfortunately 😓
I suggest `opentelemetry-instrumentation-openai-v2` and the following plan:
- we'll start with this name and evolve it along with semconv to reach feature parity with `opentelemetry-instrumentation-openai`
- we'll need to check with @nirga, but based on the previous discussions they should, at some point, be able to publish the `opentelemetry-instrumentation-openai` package from OTel
- when Traceloop and we are ready, we'll start publishing `opentelemetry-instrumentation-openai` v2 from OTel and stop publishing `opentelemetry-instrumentation-openai-v2`.

This would let us unblock this work and give Traceloop time to migrate. All the details need further discussion (and of course the name I came up with is almost random; other suggestions are welcome).
```python
from pydantic import BaseModel, ConfigDict, Field


class SpanAttributes:
```
You should use the attributes from the semantic-conventions package instead of duplicating them here; see https://github.com/open-telemetry/opentelemetry-python/blob/main/opentelemetry-semantic-conventions/src/opentelemetry/semconv/_incubating/attributes/gen_ai_attributes.py. It's fine to add the missing ones here.
```text
openai==1.37.1
```
You need to add all the openai dependencies to keep things reproducible.
```python
RESPONSE = "response"


class LLMSpanAttributes(BaseModel):
```
Could you please elaborate a bit on why we need pydantic for this?
It's mainly an extra layer of validation. At the patch level, what happens is the following:
- prepare an object containing the span attributes needed
- validate that required fields are present, e.g. `gen_ai.operation.name` and `gen_ai.request.model`
- save those attributes upon successful validation
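A minimal sketch of that validation layer (illustrative only; pydantic v2 is assumed, and the real `LLMSpanAttributes` in the PR has many more fields):

```python
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field, ValidationError


class LLMSpanAttributes(BaseModel):
    # Accept both field names and the dotted semconv aliases.
    model_config = ConfigDict(populate_by_name=True, extra="allow")

    gen_ai_operation_name: str = Field(alias="gen_ai.operation.name")
    gen_ai_request_model: str = Field(alias="gen_ai.request.model")
    gen_ai_request_temperature: Optional[float] = Field(
        None, alias="gen_ai.request.temperature"
    )


# Valid input: both required fields are present.
attrs = LLMSpanAttributes(
    **{"gen_ai.operation.name": "chat", "gen_ai.request.model": "gpt-4o"}
)

# Missing gen_ai.request.model is rejected at construction time.
try:
    LLMSpanAttributes(**{"gen_ai.operation.name": "chat"})
    missing_model_rejected = False
except ValidationError:
    missing_model_rejected = True
```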
Wouldn't we validate it with tests? If something fails in runtime our validation would not really help anyone.
```python
):
    set_span_attribute(
        span,
        SpanAttributes.LLM_SYSTEM_FINGERPRINT,
```
This is not in semconv; could you please send a PR to add it there?
```python
def set_span_attributes(span: Span, attributes: dict):
    for field, value in attributes.model_dump(by_alias=True).items():
        set_span_attribute(span, field, value)
```
Which attributes will it produce, and what would their names be? We should record the attributes documented in the spec: https://github.com/open-telemetry/semantic-conventions/blob/v1.27.0/docs/gen-ai/gen-ai-spans.md#genai-attributes
@lmolkova These are coming from `LLMSpanAttributes` fed with `SpanAttributes` from https://github.com/open-telemetry/opentelemetry-python-contrib/pull/2759/files#diff-0042eaf389fe22a3c3045eda672af6ece462798488c3c3712e99b0727a114f45R21
```python
span.add_event(
    name=SpanAttributes.LLM_CONTENT_PROMPT,
    attributes={
        SpanAttributes.LLM_PROMPTS: prompt,
```
it should be an event
```python
if usage is not None:
    set_span_attribute(
        span,
        SpanAttributes.LLM_USAGE_PROMPT_TOKENS,
```
`SpanAttributes.LLM_USAGE_PROMPT_TOKENS` is deprecated both in the spec and in the semconv package. We might need to update to the latest semconv package version to use all the new attributes.
Great start, thanks a lot for doing this!
1. clean up prompt calculations to be deduced from the last streaming chunk
2. save correct span name
3. remove recording exceptions and setting status to ok
4. remove saving stream chunks in events
…ween external and otel instrumentations (#4187)

Some package managers (PyPI) don't provide means to reserve namespaces for projects. We also have a number of **external** instrumentation libraries in Python that follow current guidance and use the `opentelemetry-instrumentation-{component}` naming pattern. These libraries are hard to distinguish from OTel-authored ones. Also, when someone (legitimately following existing guidance) creates an external instrumentation package like this, it blocks our ability to have OTel-authored instrumentation with this 'good' name. See open-telemetry/opentelemetry-python-contrib#2759 (comment) for a real-life example.

## Changes

This PR changes the recommendation to:
- OTel-authored instrumentation should use the `opentelemetry-instrumentation-*` pattern
- external instrumentation should not use this pattern and should prefix the library name with their company/project/etc. name

Co-authored-by: Armin Ruech <[email protected]>
```python
def _instrument(self, **kwargs):
    """Enable OpenAI instrumentation."""
    tracer_provider = kwargs.get("tracer_provider")
    tracer = get_tracer(__name__, "", tracer_provider)
```
nit:

```diff
- tracer = get_tracer(__name__, "", tracer_provider)
+ tracer = get_tracer(__name__, "", tracer_provider, schema_url=Schemas.V1_27_0)
```
```python
span = tracer.start_span(
    name=span_name,
    kind=SpanKind.CLIENT,
    context=set_span_in_context(trace.get_current_span()),
```
There should be no need for this; it is the default behavior:

```diff
- context=set_span_in_context(trace.get_current_span()),
```
agree, will remove
```python
except Exception as error:
    span.set_status(Status(StatusCode.ERROR, str(error)))
    span.set_attribute("error.type", error.__class__.__name__)
```
`error.type` should be fully qualified:

```diff
  from opentelemetry.semconv.attributes import error_attributes

- span.set_attribute("error.type", error.__class__.__name__)
+ span.set_attribute(error_attributes.ERROR_TYPE, type(error).__qualname__)
```
Agree, will fix.
```python
set_span_attribute(
    span, GenAIAttributes.GEN_AI_RESPONSE_MODEL, result.model
)
print(result)
```
oops
```python
if choice.finish_reason:
    set_span_attribute(
        span,
        GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS,
```
This should be an array of all reasons returned:

```python
finish_reasons = []
for choice in choices:
    finish_reasons.append(choice.finish_reason or "error")
set_span_attribute(span, GenAIAttributes.GEN_AI_RESPONSE_FINISH_REASONS, finish_reasons)
```
```python
or kwargs.get("k")
or kwargs.get("top_k")
```
OpenAI does not have `top_k` or `k` parameters; do we need them?
nope, will be removed
```python
else None
)
top_k = (
    kwargs.get("n")
```
`n` represents the number of choices, while Claude's `top_k` means "Only sample from the top K options for each subsequent token." They don't seem to be related.
```python
GenAIAttributes.GEN_AI_SYSTEM: GenAIAttributes.GenAiSystemValues.OPENAI.value,
GenAIAttributes.GEN_AI_REQUEST_MODEL: model or kwargs.get("model"),
GenAIAttributes.GEN_AI_REQUEST_TEMPERATURE: kwargs.get("temperature"),
GenAIAttributes.GEN_AI_REQUEST_TOP_K: top_k,
```
See my previous comments about `top_k`:

```diff
- GenAIAttributes.GEN_AI_REQUEST_TOP_K: top_k,
```
```python
# See the License for the specific language governing permissions and
# limitations under the License.

__version__ = "0.48b0.dev"
```
I think it should be

```diff
- __version__ = "0.48b0.dev"
+ __version__ = "0.49b0.dev"
```

now.
We really need some tests :) Happy to help with additional test cases if you can create the first happy-path test with the tool you mentioned that uses recorded content.
```python
__name__,
"",
tracer_provider,
schema_url="https://opentelemetry.io/schemas/1.27.0",
```
Why not

```python
from opentelemetry.semconv.schemas import Schemas
...
schema_url=Schemas.V1_27_0
```

?
```python
if exc_type is not None:
    self.span.set_status(Status(StatusCode.ERROR, str(exc_val)))
    self.span.set_attribute(
        ErrorAttributes.ERROR_TYPE, exc_type.__name__
```
```diff
- ErrorAttributes.ERROR_TYPE, exc_type.__name__
+ ErrorAttributes.ERROR_TYPE, exc_type.__qualname__
```
```python
def __exit__(self, exc_type, exc_val, exc_tb):
    try:
        if exc_type is not None:
            self.span.set_status(Status(StatusCode.ERROR, str(exc_val)))
```
nit: consider moving error status reporting to a helper method
```python
)
gen_ai_usage_total_tokens: Optional[float] = Field(
    None,
    alias="gen_ai.usage.total_tokens",
```
Please remove unused attributes.
@karthikscale3 This is great, thanks for starting this! We are interested in using this as well. How can we prioritize it to be approved soon? Also, is there any documentation/readme with examples and steps on how users can use this?
Description
This PR adds support for tracing the official Python openai library. It introduces initial support for instrumenting the OpenAI Python library, specifically targeting the chat completion create method, in alignment with the GenAI semantic conventions.
Fixes #1796
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
This is a work-in-progress PR; I plan to add unit tests and update this section.
Does This PR Require a Core Repo Change?
Checklist:
See contributing.md for styleguide, changelog guidelines, and more.