fix(llmobs): recreate writer on fork #10249

Merged: 4 commits into main from yunkim/llmobs-fix-fork-writer, Aug 22, 2024

@Yun-Kim Yun-Kim commented Aug 16, 2024

This PR ensures that the LLMObsSpanWriter is correctly recreated and restarted on a forked process.

Previously, the writer worker was not recreated or restarted correctly after a process fork. As a result, in applications running under celery or gunicorn, LLMObs spans were created but never submitted, because the forked worker process had no running writer thread.
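
The failure mode and the fix described above can be sketched with the standard library alone. This is a minimal illustration, not the code in this PR: `SpanWriter`, `Service`, `child_after_fork`, and `install_fork_hook` are hypothetical names, and the sketch assumes the fix amounts to replacing the inherited writer with a fresh, started one in the child process via `os.register_at_fork`.

```python
import os
import threading


class SpanWriter:
    """Hypothetical stand-in for a buffered writer with a background flush thread."""

    def __init__(self, interval=0.05):
        self.interval = interval
        self.buffer = []
        self.sent = []
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = None

    def start(self):
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self):
        self._stop.set()
        if self._thread is not None:
            self._thread.join()

    def _run(self):
        # Event.wait() returns False on timeout, so flush once per interval until stopped.
        while not self._stop.wait(self.interval):
            self.flush()

    def enqueue(self, span):
        with self._lock:
            self.buffer.append(span)

    def flush(self):
        with self._lock:
            batch, self.buffer = self.buffer, []
            self.sent.extend(batch)  # stand-in for submitting the batch over HTTP

    def is_running(self):
        return self._thread is not None and self._thread.is_alive()

    def recreate(self):
        # Fresh lock, buffer, and thread handle: nothing inherited from the parent.
        return SpanWriter(self.interval)


class Service:
    """Hypothetical owner of the writer, analogous to an enabled LLMObs service."""

    def __init__(self):
        self.writer = SpanWriter()
        self.writer.start()

    def child_after_fork(self):
        # Only the forking thread survives fork(), so the copied writer has no live thread.
        self.writer = self.writer.recreate()
        self.writer.start()


def install_fork_hook(service):
    # os.register_at_fork is available on Unix with Python 3.7+.
    if hasattr(os, "register_at_fork"):
        os.register_at_fork(after_in_child=service.child_after_fork)
```

With the hook installed, a gunicorn or celery worker forked from the parent would start with an empty buffer and a live flush thread, instead of a writer object whose thread exists only in the parent.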

Checklist

  • PR author has checked that all the criteria below are met
  • The PR description includes an overview of the change
  • The PR description articulates the motivation for the change
  • The change includes tests OR the PR description describes a testing strategy
  • The PR description notes risks associated with the change, if any
  • Newly-added code is easy to change
  • The change follows the library release note guidelines
  • The change includes or references documentation updates if necessary
  • Backport labels are set (if applicable)

Reviewer Checklist

  • Reviewer has checked that all the criteria below are met
  • Title is accurate
  • All changes are related to the pull request's stated goal
  • Avoids breaking API changes
  • Testing strategy adequately addresses listed risks
  • Newly-added code is easy to change
  • Release note makes sense to a user of the library
  • If necessary, author has acknowledged and discussed the performance implications of this PR as reported in the benchmarks PR comment
  • Backport labels are set in a manner that is consistent with the release branch maintenance policy

github-actions bot commented Aug 16, 2024

CODEOWNERS have been resolved as:

releasenotes/notes/fix-llmobs-forked-writer-257b993bcf131af8.yaml       @DataDog/apm-python
ddtrace/llmobs/__init__.py                                              @DataDog/ml-observability
ddtrace/llmobs/_llmobs.py                                               @DataDog/ml-observability
ddtrace/llmobs/_writer.py                                               @DataDog/ml-observability
tests/llmobs/test_llmobs_service.py                                     @DataDog/ml-observability

datadog-dd-trace-py-rkomorn bot commented Aug 16, 2024

Datadog Report

Branch report: yunkim/llmobs-fix-fork-writer
Commit report: ef8ddcd
Test service: dd-trace-py

✅ 0 Failed, 2624 Passed, 38250 Skipped, 28m 14.8s Total Time

pr-commenter bot commented Aug 16, 2024

Benchmarks

Benchmark execution time: 2024-08-16 22:27:28

Comparing candidate commit 2fffeb3 in PR branch yunkim/llmobs-fix-fork-writer with baseline commit fc1209e in branch main.

Found 0 performance improvements and 0 performance regressions! Performance is the same for 353 metrics, 47 unstable metrics.

P403n1x87 commented Aug 16, 2024

May I suggest a change similar to #10247 to automatically recreate the periodic thread on fork (usage example in #10251)?
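
One way to picture "automatically recreate the periodic thread on fork" is a worker class that tracks its own instances and restarts them in the child. The sketch below is an assumption-laden illustration using only the standard library. `ForkAwarePeriodic` and `restart_all` are hypothetical names and do not reproduce the utility introduced in the referenced PRs.

```python
import os
import threading
import weakref


class ForkAwarePeriodic:
    """Hypothetical periodic worker; not the utility class from the referenced PRs."""

    _instances = weakref.WeakSet()

    def __init__(self, interval, target):
        self.interval = interval
        self.target = target
        self._stop = None
        self._thread = None
        ForkAwarePeriodic._instances.add(self)

    def start(self):
        # Each thread gets its own stop event so a restart never reuses stale state.
        stop = threading.Event()
        self._stop = stop
        self._thread = threading.Thread(target=self._loop, args=(stop,), daemon=True)
        self._thread.start()

    def _loop(self, stop):
        while not stop.wait(self.interval):
            self.target()

    def stop(self):
        if self._thread is not None:
            self._stop.set()
            self._thread.join()
            self._thread = None

    def is_alive(self):
        return self._thread is not None and self._thread.is_alive()

    @classmethod
    def restart_all(cls):
        # Runs in the forked child: restart every worker that was started in the parent.
        for worker in list(cls._instances):
            if worker._thread is not None:
                worker.start()


# Register once at import time; os.register_at_fork exists on Unix with Python 3.7+.
if hasattr(os, "register_at_fork"):
    os.register_at_fork(after_in_child=ForkAwarePeriodic.restart_all)
```

The design choice here is that the restart logic lives in the worker class, so a service that owns such a worker would not need its own fork hook.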

@Yun-Kim Yun-Kim marked this pull request as ready for review August 16, 2024 21:47
@Yun-Kim Yun-Kim requested review from a team as code owners August 16, 2024 21:47

Yun-Kim commented Aug 16, 2024

> May I suggest a change similar to #10247 to automatically recreate the periodic thread on fork (usage example in #10251)?

Hi @P403n1x87, I have a couple of questions about Awakeable periodic services (and what they would mean for LLMObsSpanWriter in this case). This is a somewhat urgent fix, so I'd rather move forward with it for now, but I'll 100% follow up with you about this and am happy to refactor to use your utility class.

@sabrenner sabrenner enabled auto-merge (squash) August 22, 2024 13:56
@sabrenner sabrenner merged commit 64b5804 into main Aug 22, 2024
153 of 155 checks passed
@sabrenner sabrenner deleted the yunkim/llmobs-fix-fork-writer branch August 22, 2024 14:01

The backport to 2.10 failed:

The process '/usr/bin/git' failed with exit code 1

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-2.10 2.10
# Navigate to the new working tree
cd .worktrees/backport-2.10
# Create a new branch
git switch --create backport-10249-to-2.10
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 64b58044cf94110e7ddf97b656575b6707ee2a3f
# Push it to GitHub
git push --set-upstream origin backport-10249-to-2.10
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-2.10

Then, create a pull request where the base branch is 2.10 and the compare/head branch is backport-10249-to-2.10.

github-actions bot pushed a commit that referenced this pull request Aug 22, 2024
This PR ensures that the LLMObsSpanWriter is correctly recreated and
restarted on a forked process.

Previously, the writer worker was not recreated or restarted correctly
after a process fork. As a result, in applications running under celery
or gunicorn, LLMObs spans were created but never submitted, because the
forked worker process had no running writer thread.

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: Sam Brenner <[email protected]>
(cherry picked from commit 64b5804)
github-actions bot pushed a commit that referenced this pull request Aug 22, 2024
Co-authored-by: Sam Brenner <[email protected]>
(cherry picked from commit 64b5804)
sabrenner pushed a commit that referenced this pull request Aug 22, 2024
Backport 64b5804 from #10249 to 2.11.
Co-authored-by: Yun Kim <[email protected]>
emmettbutler pushed a commit that referenced this pull request Aug 22, 2024
Backport 64b5804 from #10249 to 2.12.
Co-authored-by: Yun Kim <[email protected]>
lievan added a commit that referenced this pull request Sep 23, 2024
This PR ensures that the `LLMObsEvalMetricWriter` is correctly recreated
and restarted on a forked process.

Previously, the eval metric writer worker was not recreated or
restarted after a process fork.

Mirrors #10249, but for the eval metric writer.

## Checklist
- [x] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))

## Reviewer Checklist
- [x] Reviewer has checked that all the criteria below are met 
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)

---------

Co-authored-by: lievan <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
github-actions bot pushed a commit that referenced this pull request Sep 23, 2024
Co-authored-by: lievan <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
(cherry picked from commit 5dbd7ef)
github-actions bot pushed a commit that referenced this pull request Sep 23, 2024
Co-authored-by: lievan <[email protected]>
Co-authored-by: Yun Kim <[email protected]>
(cherry picked from commit 5dbd7ef)
mabdinur pushed a commit that referenced this pull request Sep 25, 2024
Co-authored-by: lievan <[email protected]>
Co-authored-by: Yun Kim <[email protected]>