feat(llmobs): submit langchain tool.invoke tool spans #10410
Conversation
Datadog Report — Branch report: ❌ 3 Failed (0 Known Flaky), 8351 Passed, 17030 Skipped, 1h 29m 22.5s Total Time. ❌ Failed Tests (3)
)
if tool_input:
    span.set_tag_str("langchain.request.input", integration.trunc(str(tool_input)))
if config:
Is this enough, or should I parse the config entries one by one and add them individually?
- We don't need to truncate config, as that isn't direct I/O to the tool
- We can probably just JSON-dump the config, since it is a TypedDict (see source)
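The JSON-dump suggestion can be sketched as follows; `RunnableConfigLike` and `serialize_config` are hypothetical stand-ins for illustration, since langchain's actual RunnableConfig TypedDict may hold non-serializable values such as callbacks:

```python
import json
from typing import List, TypedDict


# Hypothetical stand-in for langchain_core's RunnableConfig TypedDict;
# at runtime a TypedDict is a plain dict, so json.dumps works directly.
class RunnableConfigLike(TypedDict, total=False):
    run_name: str
    tags: List[str]


def serialize_config(config) -> str:
    # default=str guards against non-JSON-serializable entries (e.g. callbacks);
    # no truncation is applied because config is not direct tool I/O.
    if not config:
        return ""
    return json.dumps(config, default=str)


config: RunnableConfigLike = {"run_name": "circumference", "tags": ["tool"]}
print(serialize_config(config))
```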
attrs={
    # Comment: Should we have more info in the log?
    "tool_name": instance.__self__.name or "",
    "input": tool_input,
    "config": config or "",
},
Should we have more info in the log?
@with_traced_module
async def traced_base_tool_ainvoke(langchain, pin, func, instance, args, kwargs):
Is there a better way to trace the async variant besides duplicating the code?
Unfortunately nope 😢 I think there are a couple of ways to depollute the patch module (extract tracing functions to separate files, reuse common code in the sync/async functions), but we can keep this for now.
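One of the "reuse common code" options mentioned could look roughly like this; the helper names (`_start_tool_span`, `_finish_tool_span`) are made up for illustration and are not actual ddtrace internals:

```python
import asyncio


# Hypothetical helpers: all span bookkeeping lives in one place so the
# sync and async wrappers differ only by an `await`.
def _start_tool_span(tool_name, tool_input):
    return {"name": tool_name, "input": str(tool_input)}


def _finish_tool_span(span, output, error=None):
    span["output"] = output
    span["error"] = error


def traced_invoke(func, tool_name, tool_input):
    span = _start_tool_span(tool_name, tool_input)
    try:
        output = func(tool_input)
    except Exception as exc:
        _finish_tool_span(span, None, error=exc)
        raise
    _finish_tool_span(span, output)
    return output


async def traced_ainvoke(func, tool_name, tool_input):
    span = _start_tool_span(tool_name, tool_input)
    try:
        output = await func(tool_input)
    except Exception as exc:
        _finish_tool_span(span, None, error=exc)
        raise
    _finish_tool_span(span, output)
    return output


print(traced_invoke(lambda x: x * 2, "double", 3))
```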
Same comments apply to the async function as above
@@ -965,6 +965,170 @@ def traced_similarity_search(langchain, pin, func, instance, args, kwargs):
    return documents


@with_traced_module
def traced_base_tool_invoke(langchain, pin, func, instance, args, kwargs):
I couldn't think of any metrics to emit from this function
Are you talking about Datadog metrics? If so, we can ignore this; it's out of scope for our team and this integration.
Don't we use any of the metrics emitted from the langchain integration?
Let's split out the APM integration and LLMObs integration into separate PRs for easier readability.
span = integration.trace(
    pin,
    "%s.%s.%s" % (func.__module__, func.__class__.__name__, func.__self__.__class__.__name__),
I think here I'd prefer to just use the tool class name instead of the entire module/path for simplicity/clarity. WDYT?
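For illustration, the two naming options compare roughly like this (the tool class and variable names here are made up, not taken from the PR):

```python
# Hypothetical tool class standing in for a langchain BaseTool subclass.
class CircumferenceTool:
    pass


tool = CircumferenceTool()

# Current approach: module path plus class name baked into the span name.
verbose_name = "%s.%s" % (type(tool).__module__, type(tool).__name__)

# Suggested approach: just the tool class name, for simplicity/clarity.
simple_name = type(tool).__name__

print(simple_name)
```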
tool_output = None
try:
    if instance.name:
        span.set_tag_str("langchain.request.tool.name", integration.trunc(str(instance.name)))
Truncation is only ever performed on input/outputs, so no need to do it here
try:
    if instance.name:
        span.set_tag_str("langchain.request.tool.name", integration.trunc(str(instance.name)))
    if integration.is_pc_sampled_span(span):
pc_sampling refers to prompt/completion sampling, or inputs/outputs in this case. We can safely tag most of the items below without needing to check is_pc_sampled_span(...); we only need this check for input and the return value.
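A sketch of that gating, with made-up names standing in for the integration's tag helpers (illustrative only, not the ddtrace API):

```python
# Metadata-style tags are always set, while input and output tags are
# gated on prompt/completion (pc) sampling.
def tag_tool_span(tags, name, description, tool_input, tool_output, pc_sampled):
    tags["langchain.request.tool.name"] = name
    tags["langchain.request.tool.description"] = description  # not I/O: no check
    if pc_sampled:
        tags["langchain.request.input"] = str(tool_input)
        tags["langchain.response.output"] = str(tool_output)
    return tags


print(tag_tool_span({}, "circumference", "circle math", {"radius": 2}, 12.56, False))
```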
releasenotes/notes/feat-llmobs-submit-langchain-tool-spans-d7274ae79d3808e9.yaml
from math import pi

cassette_name = "langchain_tool_invoke.yaml"
with request_vcr.use_cassette(cassette_name):
AFAIK tools can be called without a network connection. Why do we need a cassette for this?
Yes, gotcha! I was thinking of adding the tool to an LLM but opted for a simple test instead. I removed it in an unpushed commit, thanks!
I still left the option to call invoke with a cassette on the LLMObs side for any future agent tests.
def _invoke_tool(cls, tool, tool_input, mock_tracer, cassette_name):
    LLMObs.enable(ml_app=cls.ml_app, integrations_enabled=False, _tracer=mock_tracer)
    with get_request_vcr(subdirectory_name=cls.cassette_subdirectory_name).use_cassette(cassette_name):
        if LANGCHAIN_VERSION > (0, 1):
Is there any particular reason we're version gating this here?
You mentioned in the ticket description that "you'll only need to trace BaseTool.invoke() for langchain v1 and up", so I was trying to make sure that's the case. I'm guessing that since this is in the langchain_community test suite, it will always be the case?
def _arun(self, radius: int):
    raise NotImplementedError("This tool does not support async")

cassette_name = "langchain_tool_invoke_39.yaml" if PY39 else "langchain_tool_invoke.yaml"
Same here, not sure we need a cassette file for this test.
self,
span: Span,
tool_input: Union[str, Dict[str, Any], Any],
tool_output: Any,
🟠 Code Quality Violation: do not use Any, use a concrete type. Use the Any type very carefully. Most of the time, Any is used because we do not know exactly what type is being used. If you want to specify that a value can be of any type, use object instead of Any.
Co-authored-by: Yun Kim <[email protected]>
…74ae79d3808e9.yaml Co-authored-by: Yun Kim <[email protected]>
…a/trace-base-tool
# Conflicts:
#   ddtrace/llmobs/_integrations/langchain.py
#   tests/contrib/langchain/test_langchain_llmobs.py
result in a 1-span trace with a tool span.
"""
if langchain_core is None:
    pytest.skip("langchain-core not installed which is required for this test.")
Very small nit: I think it's ever-so-slightly preferable to use a skipif mark in this case.
I agree, but the issue is that this specific dependency isn't needed in all tests and, if I understand correctly, fixtures are not available at the level of skipif annotations.
So in order to have it available there I'd need to import it for the whole file, which is not very optimal since it's only used in a couple of tests.
Do you know of any workaround for this?
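For reference, the module-level alternative being weighed could look something like this; a sketch using standard pytest idioms, not this repo's actual test conventions:

```python
import importlib.util

# Detect the optional dependency at import time; unlike a fixture value,
# this condition is available to a module- or test-level skipif mark.
HAS_LANGCHAIN_CORE = importlib.util.find_spec("langchain_core") is not None

# In a pytest module one would then write:
#   pytestmark = pytest.mark.skipif(
#       not HAS_LANGCHAIN_CORE, reason="langchain-core not installed"
#   )
print(HAS_LANGCHAIN_CORE)
```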
I don't know of a workaround if you want to keep using langchain_core as a fixture, no.
My personal taste is that it's more traceable to import langchain_core in this module and skipif if it's absent than to go sort out why the langchain_core fixture is None.
Looks like there are ~10 tests that use the langchain_core fixture, so it's going to get loaded in just about all CI cases anyway?
Anyway, it's a very small nit. :)
Yeah, I completely understand your point.
I believe I'll just keep it as it is for now, since I didn't author this file before this PR and wouldn't want to change the existing structure until I speak to the original authors.
Thanks for giving it a look, Romain! 🙇
This PR adds instrumentation to submit langchain tool.invoke/ainvoke tool spans to APM and LLM Observability.
The tool spans sent to APM will contain the following I/O data and tags:
- langchain.request.tool.name
- langchain.request.tool.description
- langchain.request.tool.metadata
- langchain.request.tool.tags
- langchain.request.input
- langchain.request.config
- langchain.response.output
The tool spans sent to LLM Observability will contain the following I/O data:
- input.value: a text field annotating the tool input
- output.value: the output of the tool (can be any type)
Ticket: https://datadoghq.atlassian.net/browse/MLOB-935
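As a rough illustration of the kind of call this instruments (not taken from the PR's test suite; the function below is a guess at the shape of such a tool's logic):

```python
from math import pi


# A plain function standing in for a langchain tool's _run method; with the
# integration enabled, tool.invoke({"radius": 2}) around logic like this would
# emit a span carrying the langchain.request.* / langchain.response.* tags above.
def circumference(radius: float) -> float:
    """Return the circumference of a circle of the given radius."""
    return 2 * pi * radius


print(circumference(2.0))
```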
Checklist
Reviewer Checklist