As a dev, I want to explore if annotating Pydantic models can improve GPT performance in our pipeline #49

k-allagbe · 2024-10-09T18:17:48Z

Description

Context
Currently, we pass our Pydantic model's JSON schema to GPT, like this:

json_schema = FertilizerInspection.model_json_schema()
signature = dspy.ChainOfThought(ProduceLabelForm)
prediction = signature(text=text, json_schema=json_schema, requirements=REQUIREMENTS)

As an example of what model_json_schema() produces, consider this small model:

from pydantic import BaseModel

class Address(BaseModel):
    street: str

class User(BaseModel):
    id: int
    email: str | None = None
    address: Address

print(User.model_json_schema())

Results:

{
  '$defs': {
    'Address': {
      'properties': {'street': {'title': 'Street', 'type': 'string'}},
      'required': ['street'],
      'title': 'Address',
      'type': 'object'
    }
  },
  'properties': {
    'id': {'title': 'Id', 'type': 'integer'},
    'email': {
      'anyOf': [{'type': 'string'}, {'type': 'null'}],
      'default': None,
      'title': 'Email'
    },
    'address': {'$ref': '#/$defs/Address'}
  },
  'required': ['id', 'address'],
  'title': 'User',
  'type': 'object'
}

Problem Statement
I want to investigate whether annotating the Pydantic model with additional metadata (such as field descriptions and examples) improves the accuracy of GPT's predictions in our pipeline.

For instance, here's how we can annotate the same model:

from pydantic import BaseModel, Field

class Address(BaseModel):
    street: str = Field(..., description="Street address of the user", example="123 Main St")

class User(BaseModel):
    id: int = Field(..., description="User's unique identifier", example=1)
    email: str | None = Field(None, description="User's email address, optional", example="email@somewhere")
    address: Address = Field(..., description="Address details of the user")

print(User.model_json_schema())

Results with annotations:

{
  '$defs': {
    'Address': {
      'properties': {
        'street': {
          'description': 'Street address of the user',
          'example': '123 Main St',
          'title': 'Street',
          'type': 'string'
        }
      },
      'required': ['street'],
      'title': 'Address',
      'type': 'object'
    }
  },
  'properties': {
    'id': {
      'description': "User's unique identifier",
      'example': 1,
      'title': 'Id',
      'type': 'integer'
    },
    'email': {
      'anyOf': [{'type': 'string'}, {'type': 'null'}],
      'default': None,
      'description': "User's email address, optional",
      'example': 'email@somewhere',
      'title': 'Email'
    },
    'address': {
      'allOf': [{'$ref': '#/$defs/Address'}],
      'description': 'Address details of the user'
    }
  },
  'required': ['id', 'address'],
  'title': 'User',
  'type': 'object'
}

Acceptance Criteria

Research the effectiveness of annotated Pydantic models when passed to GPT for predictions.
Measure any performance improvements (e.g., more accurate responses, better contextual understanding).
If improvements are found, quantify them and document the changes.
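One possible way to quantify the criteria above is exact-match accuracy per field, computed once for predictions made with the plain schema and once with the annotated schema. The sketch below is only an illustration: `field_accuracy` is a hypothetical helper, and the toy records are made up for the example, not results from our pipeline.

```python
def field_accuracy(predictions, ground_truths):
    """Fraction of ground-truth fields whose predicted value matches exactly."""
    matched = 0
    total = 0
    for predicted, expected in zip(predictions, ground_truths):
        for field, expected_value in expected.items():
            total += 1
            if predicted.get(field) == expected_value:
                matched += 1
    return matched / total if total else 0.0


# Hypothetical toy data for illustration only (not real pipeline output).
ground_truths = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": None}]
baseline_preds = [{"id": 1, "email": None}, {"id": 2, "email": None}]
annotated_preds = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": None}]

baseline_score = field_accuracy(baseline_preds, ground_truths)
annotated_score = field_accuracy(annotated_preds, ground_truths)
print(f"baseline={baseline_score:.2f} annotated={annotated_score:.2f} "
      f"delta={annotated_score - baseline_score:+.2f}")
```

Exact match is a strict metric. For free-text fields a fuzzy comparison may be more appropriate, but that choice is left open here.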

Additional Information

Consider testing with different levels of model complexity and annotation depth.
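To compare annotation depths without maintaining several copies of the model, one option is to generate the fully annotated schema once and strip annotation keys from it to produce the shallower variants. The sketch below assumes the schema is a plain dict as returned by model_json_schema(). `strip_annotations` is a hypothetical helper, not part of Pydantic or dspy.

```python
def strip_annotations(node, keys=("description", "example", "examples"), _names=False):
    """Return a copy of a JSON schema with the given annotation keys removed.

    Keys directly under "properties" and "$defs" are field/model names, not
    schema keywords, so they are never stripped (a field may be called
    "description").
    """
    if isinstance(node, dict):
        return {
            k: strip_annotations(v, keys, _names=(k in ("properties", "$defs") and not _names))
            for k, v in node.items()
            if _names or k not in keys
        }
    if isinstance(node, list):
        return [strip_annotations(item, keys) for item in node]
    return node


annotated = {
    "properties": {
        "id": {
            "description": "User's unique identifier",
            "example": 1,
            "title": "Id",
            "type": "integer",
        }
    },
    "required": ["id"],
    "title": "User",
    "type": "object",
}

descriptions_only = strip_annotations(annotated, keys=("example", "examples"))
bare = strip_annotations(annotated)
print(bare["properties"]["id"])
```

Each variant can then be passed as `json_schema` in the existing call, giving three annotation levels (bare, descriptions only, descriptions plus examples) from a single model definition.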

snakedye assigned snakedye and unassigned snakedye Oct 15, 2024

snakedye linked a pull request Oct 16, 2024 that will close this issue

Issue #49 : Use the pydantic model type in the dspy Signature #51

Open
