Native SDK support for Structured Outputs #508

Open
kirby-jobber opened this issue Aug 6, 2024 · 9 comments

@kirby-jobber

The Problem

We all hate trying to coax ChatGPT into adhering to a JSON schema. OpenAI decided to make that easier for us.

The flow:

  • Declare the data transfer objects (DTOs) that you want from the model.
  • Demand the response exactly in your format.
  • Parse the response with code instead of a godawful regex/string extractor.

It would be really nice to have native support within this Ruby gem!
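
The three steps of the flow can be sketched in plain Ruby with only the standard library. This is a minimal illustration, not the gem's API: `build_parameters` and `parse_content` are hypothetical helper names, the `response_format` shape follows the `json_schema` form used elsewhere in this thread, and `sample_content` is a hand-written stand-in for a model reply.

```ruby
require "json"

# 1. Declare the shape you want back (a plain JSON Schema hash).
SCHEMA = {
  type: "object",
  properties: { final_answer: { type: "string" } },
  required: ["final_answer"],
  additionalProperties: false
}.freeze

# 2. Demand that shape via the response_format request parameter.
#    (Hypothetical helper; it only builds the parameters hash.)
def build_parameters(schema)
  {
    model: "gpt-4o-2024-08-06",
    messages: [{ role: "user", content: "solve 8x + 31 = 2" }],
    response_format: {
      type: "json_schema",
      json_schema: { name: "math_response", strict: true, schema: schema }
    }
  }
end

# 3. Parse the returned message content as JSON, no regex needed.
def parse_content(content)
  JSON.parse(content)
end

params = build_parameters(SCHEMA)
# Hand-written stand-in for the "content" string of a reply (illustrative only).
sample_content = '{"final_answer": "x = -3.625"}'
answer = parse_content(sample_content)

puts params.dig(:response_format, :type)
puts answer["final_answer"]
```

The parameters hash is what would be passed as `parameters:` to `client.chat`; the network call itself is left out so the sketch runs offline.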

Prior Art

In Python, a nice example with math Q&A:

from pydantic import BaseModel

from openai import OpenAI


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: list[Step]
    final_answer: str


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "solve 8x + 31 = 2"},
    ],
    response_format=MathResponse,
)

message = completion.choices[0].message
if message.parsed:
    print(message.parsed.steps)
    print(message.parsed.final_answer)
else:
    print(message.refusal)

Potential Solve

Not sure what the ideal pydantic replacement would be, but perhaps dry-struct? DTO declarations could look like this:

require 'dry-struct'
require 'dry-types'

module Types
  include Dry.Types()
end

class Step < Dry::Struct
  attribute :explanation, Types::String
  attribute :output, Types::String
end

class MathResponse < Dry::Struct
  attribute :steps, Types::Array.of(Step)
  attribute :final_answer, Types::String
end

(Not sure if the gem should handle any schema validations, since that's purportedly OpenAI's job, but there's dry-validation if so.)

The rest of OpenAI's math tutor example might look like:

client = OpenAI::Client.new

completion = client.chat(
  parameters: {
    model: "gpt-4o-2024-08-06",
    messages: [
      { role: "system", content: "You are a helpful math tutor."},
      { role: "user", content: "solve 8x + 31 = 2"},
    ],
    response_format: MathResponse
  }
)

message = completion.dig("choices", 0, "message")
if message.parsed
  message.parsed.steps.each do |step|
    puts step.explanation
    puts step.output
  end
  puts message.parsed.final_answer
else
  puts message.refusal
end
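
For `message.parsed` to return objects rather than a bare Hash, something has to hydrate the JSON content into the declared DTOs. A minimal sketch of that step, assuming plain `Struct` classes as stand-ins for the `Dry::Struct` declarations above (`hydrate_math_response` is a hypothetical helper, and the sample JSON is hand-written, not real model output):

```ruby
require "json"

# Hypothetical plain-Ruby DTOs standing in for the Dry::Struct classes above.
Step = Struct.new(:explanation, :output, keyword_init: true)
MathResponse = Struct.new(:steps, :final_answer, keyword_init: true)

# Build a MathResponse from the JSON text of a message's content.
# Unknown keys raise ArgumentError and missing keys raise KeyError.
def hydrate_math_response(json_text)
  data = JSON.parse(json_text, symbolize_names: true)
  steps = data.fetch(:steps).map { |step| Step.new(**step) }
  MathResponse.new(steps: steps, final_answer: data.fetch(:final_answer))
end

# Hand-written sample content (illustrative only).
sample = <<~JSON
  {
    "steps": [
      { "explanation": "Subtract 31 from both sides.", "output": "8x = -29" },
      { "explanation": "Divide both sides by 8.", "output": "x = -29/8" }
    ],
    "final_answer": "x = -29/8"
  }
JSON

parsed = hydrate_math_response(sample)
parsed.steps.each { |step| puts "#{step.explanation} #{step.output}" }
puts parsed.final_answer
```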
@bastos

bastos commented Aug 7, 2024

I like dry-rb and use it daily, but I think the gem should stay bare-bones and let developers build abstractions on top of it using dry-rb, Sorbet, Active Record, etc. At the very least, passing and returning dry-rb objects should be optional.

Example:

client = OpenAI::Client.new

completion = client.chat(
  parameters: {
    model: "gpt-4o-2024-08-06",
    messages: [
      { role: "system", content: "You are a helpful math tutor." },
      { role: "user", content: "solve 8x + 31 = 2" },
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "math_response",
        strict: true,
        schema: {
          type: "object",
          properties: {
            steps: {
              type: "array",
              items: {
                type: "object",
                properties: {
                  explanation: {
                    type: "string",
                  },
                  output: {
                    type: "string",
                  },
                },
                required: ["explanation", "output"],
                additionalProperties: false,
              },
            },
            final_answer: {
              type: "string",
            },
          },
          required: ["steps", "final_answer"],
          additionalProperties: false,
        },
      },
    },
  }
)

And if response_format is passed, the library should use JSON.parse on the output.
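
A rough sketch of what that automatic parsing could look like, assuming the response is the string-keyed Hash the gem returns today. `with_parsed_content` is a hypothetical helper name, not an existing method, and both sample responses are hand-written stand-ins. A refusal is left unparsed, since a refused message carries no JSON content.

```ruby
require "json"

# Hypothetical helper: returns the message with a "parsed" key added when a
# json_schema response_format was requested and the model did not refuse.
def with_parsed_content(response, response_format)
  message = response.dig("choices", 0, "message") || {}
  wants_json = response_format.is_a?(Hash) &&
               response_format[:type] == "json_schema"
  return message unless wants_json && message["refusal"].nil? && message["content"]

  message.merge("parsed" => JSON.parse(message["content"]))
end

# Hand-written stand-ins for API responses (not real output).
ok = {
  "choices" => [
    { "message" => { "content" => '{"final_answer":"x = -29/8"}', "refusal" => nil } }
  ]
}
refused = {
  "choices" => [
    { "message" => { "content" => nil, "refusal" => "I can't help with that." } }
  ]
}
format = { type: "json_schema", json_schema: { name: "math_response" } }

puts with_parsed_content(ok, format)["parsed"]["final_answer"]
puts with_parsed_content(refused, format).key?("parsed")
```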

@jeremedia

The groq-ruby gem has an optional dry-schema integration: https://github.com/drnic/groq-ruby#using-dry-schema-with-json-mode

@bastos

bastos commented Aug 8, 2024

The groq-ruby gem has an optional dry-schema integration: https://github.com/drnic/groq-ruby#using-dry-schema-with-json-mode

And the Node official library has support for Zod. So, making an argument against my previous comment, maybe it should support dry's Dry::Schema.JSON (optionally, I hope).

@jeremedia

jeremedia commented Aug 15, 2024

Here's a Ruby script implementing StructuredOutputs::Schema that provides the functionality OpenAI added to their Python SDK for super-simple structured output definitions. Replicates their cookbook example perfectly.

Key thing is this simplicity:

class MathReasoning < StructuredOutputs::Schema
  def initialize
    super do
      define :step do
        string :explanation
        string :output
      end
      array :steps, items: ref(:step)
      string :final_answer
    end
  end
end

schema = MathReasoning.new
  
result = client.parse(
  model: "gpt-4o-2024-08-06",
  messages: [
    { role: "system", content: "You are a helpful math tutor. Guide the user through the solution step by step." },
    { role: "user", content: "how can I solve 8x + 7 = -23" }
  ],
  response_format: schema
)

To get this:

{
  "steps": [
    {
      "expression": "8x + 7 = -23",
      "explanation": "This is your starting equation."
    },
    {
      "expression": "8x = -23 - 7",
      "explanation": "Subtract 7 from both sides to isolate the term with 'x' on the left side."
    },
    {
      "expression": "8x = -30",
      "explanation": "Simplify the right side by calculating -23 - 7, which is -30."
    },
    {
      "expression": "x = -30 / 8",
      "explanation": "Divide both sides by 8 to solve for 'x'."
    },
    {
      "expression": "x = -3.75",
      "explanation": "Simplify the division to get the final value of 'x'. Alternatively, divide both sides of the equation 30 by 2 to simplify it down to 15, then divide again to get 15 divided by 4, which is -3.75."
    }
  ],
  "final_answer": "x = -3.75"
}
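
Note that each step in the response above carries an `expression` key, while the `MathReasoning` schema declares `output`, so the sample presumably came from a different revision of the schema. A small check like the following would flag that kind of drift. It is only a sketch: `mismatched_steps` is a hypothetical helper, and the JSON is an abbreviated copy of the first step shown above.

```ruby
require "json"

# Property names declared for each step in the MathReasoning schema above.
DECLARED_STEP_KEYS = %w[explanation output].freeze

# Return the indexes of steps whose keys differ from the declared ones.
def mismatched_steps(parsed)
  parsed.fetch("steps").each_index.reject do |i|
    parsed["steps"][i].keys.sort == DECLARED_STEP_KEYS.sort
  end
end

# Abbreviated copy of the first step from the response shown above.
response = JSON.parse(<<~JSON)
  {
    "steps": [
      { "expression": "8x + 7 = -23", "explanation": "This is your starting equation." }
    ],
    "final_answer": "x = -3.75"
  }
JSON

p mismatched_steps(response)
```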

@diegocostares

diegocostares commented Sep 2, 2024

The response_format is very necessary. This script is excellent, and I hope it gets implemented in the project because it’s extremely useful!
@alexrudall Do you think it could be included?

@frmsaul

frmsaul commented Sep 2, 2024

Strongly endorse @bastos's approach.

@stephenreid321

Hoping this gets included soon!

@adenta

adenta commented Oct 1, 2024

I support this. Is there a bounty?

@adenta

adenta commented Oct 1, 2024

I adapted @jeremedia's script to work with Rails. I renamed Schema to BaseSchema so as not to conflict with the existing schema.rb.

class BaseSchema
  MAX_OBJECT_PROPERTIES = 100
  MAX_NESTING_DEPTH = 5

  def initialize(name = nil, &block)
    # Use the provided name or derive from class name
    @name = name || self.class.name.split('::').last.downcase
    # Initialize the base schema structure
    @schema = {
      type: 'object',
      properties: {},
      required: [],
      additionalProperties: false
    }
    @definitions = {}
    # Execute the provided block to define the schema
    instance_eval(&block) if block_given?
    validate_schema
  end

  # Convert the schema to the hash expected under response_format[:json_schema].
  # `strict` sits next to `name` and `schema`, not inside the schema itself.
  def to_hash
    {
      name: @name,
      description: 'Schema for the structured response',
      strict: true,
      schema: @schema.merge({ '$defs' => @definitions })
    }
  end

  private

  # Define a string property
  def string(name, enum: nil, description: nil)
    add_property(name, { type: 'string', enum:, description: }.compact)
  end

  # Define a number property
  def number(name)
    add_property(name, { type: 'number' })
  end

  # Define a boolean property
  def boolean(name)
    add_property(name, { type: 'boolean' })
  end

  # Define an object property
  def object(name, &block)
    properties = {}
    required = []
    BaseSchema.new.tap do |s|
      s.instance_eval(&block)
      properties = s.instance_variable_get(:@schema)[:properties]
      required = s.instance_variable_get(:@schema)[:required]
    end
    add_property(name, { type: 'object', properties:, required:, additionalProperties: false })
  end

  # Define an array property
  def array(name, items:)
    add_property(name, { type: 'array', items: })
  end

  # Define an anyOf property
  def any_of(name, schemas)
    add_property(name, { anyOf: schemas })
  end

  # Define a reusable schema component
  def define(name, &block)
    @definitions[name] = BaseSchema.new(&block).instance_variable_get(:@schema)
  end

  # Reference a defined schema component
  def ref(name)
    { '$ref' => "#/$defs/#{name}" }
  end

  # Add a property to the schema
  def add_property(name, definition)
    @schema[:properties][name] = definition
    @schema[:required] << name
  end

  # Validate the schema against defined limits
  def validate_schema
    properties_count = count_properties(@schema)
    raise 'Exceeded maximum number of object properties' if properties_count > MAX_OBJECT_PROPERTIES

    max_depth = calculate_max_depth(@schema)
    raise 'Exceeded maximum nesting depth' if max_depth > MAX_NESTING_DEPTH
  end

  # Count the total number of properties in the schema
  def count_properties(schema)
    return 0 unless schema.is_a?(Hash) && schema[:properties]

    count = schema[:properties].size
    schema[:properties].each_value do |prop|
      count += count_properties(prop)
    end
    count
  end

  # Calculate the maximum nesting depth of the schema
  def calculate_max_depth(schema, current_depth = 1)
    return current_depth unless schema.is_a?(Hash) && schema[:properties]

    # `max` returns nil for an empty properties hash, so fall back to current_depth
    max_child_depth = schema[:properties].values.map do |prop|
      calculate_max_depth(prop, current_depth + 1)
    end.max
    [current_depth, max_child_depth || current_depth].max
  end
end
require 'json'
require 'dry-schema'
require 'ostruct'
require 'openai' # ruby-openai gem; Bundler loads it automatically in Rails

# Client class for interacting with OpenAI API
class OpenAISchemaClient
  def initialize
    OpenAI.configure do |config|
      config.access_token = ENV['OPENPIPE_ACCESS_TOKEN']
      config.uri_base = 'https://app.openpipe.ai/api/v1'
      config.log_errors = true
    end
    @client = OpenAI::Client.new
  end

  # Send a request to OpenAI API and parse the response
  def parse(model:, messages:, response_format:)
    response = @client.chat(
      parameters: {
        model:,
        messages:,
        response_format: {
          type: 'json_schema',
          json_schema: response_format.to_hash
        }
      }
    )

    message = response['choices'][0]['message']

    # Check for a refusal first: a refused message has no JSON content to parse
    if message['refusal']
      OpenStruct.new(refusal: message['refusal'], parsed: nil)
    else
      OpenStruct.new(refusal: nil, parsed: JSON.parse(message['content']))
    end
  end
end

# example usage:

# begin
#   # Create an OpenAI client
#   client = OpenAISchemaClient.new
#   # Create an instance of the MathReasoning schema
#   schema = MathReasoning.new

#   # Send a request to OpenAI API
#   result = client.parse(
#     model: 'gpt-4o-2024-08-06',
#     messages: [
#       { role: 'system', content: 'You are a helpful math tutor. Guide the user through the solution step by step.' },
#       { role: 'user', content: 'how can I solve 8x + 7 = -23' }
#     ],
#     response_format: schema
#   )

#   # Handle the response
#   if result.refusal
#     puts "The model refused to respond: #{result.refusal}"

#   else
#     puts JSON.pretty_generate(result.parsed)

#   end
# rescue StandardError => e
#   puts "Error: #{e}"
# end

class MathReasoning < BaseSchema
  def initialize
    super do
      define :step do
        string :explanation
        string :output
      end
      array :steps, items: ref(:step)
      string :final_answer
    end
  end
end


7 participants