
Issue #49: Use the pydantic model type in the dspy Signature #51

Open · wants to merge 4 commits into main
Conversation

snakedye (Contributor)

This PR doesn't fully resolve #49, but rather serves as a stepping stone toward using annotations on the Pydantic models in the future.

We now leverage TypedChainOfThought for better integration of the Pydantic schema in DSPy. It lets us simplify the Signature and set the Pydantic model as the output type, meaning DSPy must return an object that matches the schema.

The GPT-related internals were also refactored to match the changes in the API.
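
For reference, a minimal sketch of the typed-predictor pattern this relies on (the Answer model and its field names below are illustrative placeholders, not the actual schema from our pipeline):

    import dspy
    from pydantic import BaseModel, Field

    # Illustrative output schema; the real model in the pipeline differs.
    class Answer(BaseModel):
        summary: str = Field(description="Short answer text")
        confidence: float = Field(description="Score between 0 and 1")

    class QA(dspy.Signature):
        """Answer the question as a structured object."""
        question: str = dspy.InputField()
        answer: Answer = dspy.OutputField()

    # TypedChainOfThought parses and validates the LM output against the
    # Pydantic schema, so the program must return an object matching it.
    predictor = dspy.TypedChainOfThought(QA)
    result = predictor(question="What does this PR change?")
    print(result.answer.summary)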

Related to #49

@snakedye self-assigned this on Oct 16, 2024
@Endlessflow (Contributor) left a comment:

WARNING about the Phoenix-related observability code added here:
Tread with caution, as it will break the pipeline if the given endpoint is not valid (it will keep retrying to send the traces in an infinite loop).

Also, we forgot to remove everything related to the performance assessment from this PR. Therefore, all the changes to the following files need to be removed from the PR:

  • .gitignore
  • performance_assessment.py
  • test_data/*

Otherwise, looks good.

)

DSPyInstrumentor().instrument(tracer_provider=tracer_provider)

Contributor:

I recommend exercising caution when adding lines 8-17 to the main branch. If the endpoint isn’t valid (i.e., Phoenix isn’t set up), it WILL cause the entire pipeline to stop functioning (I had that happen to me this morning).

To mitigate this, I suggest either removing the observability code for now or commenting it out by default. We can uncomment it when we need to use traces for debugging.

Additionally, I’m unsure if this feature should have its own issue and PR, or if it’s acceptable to include it in this PR.
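
One way to make the instrumentation opt-in rather than removing it outright, as a sketch (this assumes the setup uses phoenix.otel and openinference as in the diff above; gating on the PHOENIX_COLLECTOR_ENDPOINT variable is my suggestion, not something already in the PR):

    import os

    from openinference.instrumentation.dspy import DSPyInstrumentor
    from phoenix.otel import register

    # Only wire up tracing when an endpoint is explicitly configured, so a
    # missing or unreachable Phoenix instance cannot stall the pipeline
    # with endless retry attempts.
    endpoint = os.getenv("PHOENIX_COLLECTOR_ENDPOINT")
    if endpoint:
        tracer_provider = register(endpoint=endpoint)
        DSPyInstrumentor().instrument(tracer_provider=tracer_provider)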

test_logs/
reports/
test_data/
performance_assessment.py
Contributor:

I don't think adding test_data/ and performance_assessment.py to the .gitignore was intended (or desirable).

Contributor:

We forgot to remove this from this PR.

Contributor:

Same here, test_data stuff should not be merged into main yet.

Contributor:

idem - should not be merged into main yet.

Successfully merging this pull request may close these issues:

As a dev, I want to explore if annotating Pydantic models can improve GPT performance in our pipeline