Native SDK support for Structured Outputs #508

kirby-jobber · 2024-08-06T23:56:21Z

The Problem

We all hate trying to coax ChatGPT into adhering to a JSON schema. OpenAI decided to make that easier for us.

The flow:

Declare the data transfer objects (DTOs) that you want from the model.
Demand the response exactly in your format.
Parse the response with code instead of a godawful regex or string extractor.
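
The three steps above can be sketched in plain Ruby with no gem support at all. This is only an illustration under stated assumptions: `SCHEMA`, `build_response_format`, and the canned `raw_content` string are made up for the example, and no request is actually sent.

```ruby
require 'json'

# 1. Declare the shape you want back (a trimmed version of the math example).
SCHEMA = {
  type: 'object',
  properties: {
    steps: { type: 'array', items: { type: 'string' } },
    final_answer: { type: 'string' }
  },
  required: %w[steps final_answer],
  additionalProperties: false
}.freeze

# 2. Demand that shape by wrapping the schema in a response_format payload.
#    `build_response_format` is a hypothetical helper, not part of the gem.
def build_response_format(name, schema)
  { type: 'json_schema', json_schema: { name: name, strict: true, schema: schema } }
end

response_format = build_response_format('math_response', SCHEMA)

# 3. Parse the reply with code. This string is a hand-written stand-in for
#    the message content a model would return, not real API output.
raw_content = '{"steps":["8x = 2 - 31","8x = -29"],"final_answer":"x = -29/8"}'
parsed = JSON.parse(raw_content, symbolize_names: true)

puts response_format[:json_schema][:name]
puts parsed[:final_answer]
```

The point of the sketch is that step 3 collapses to a single `JSON.parse` call once the model is held to the schema.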

It would be really nice to have native support within this ruby gem!

Prior Art

In python, a nice example with math Q&A

from pydantic import BaseModel

from openai import OpenAI


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: list[Step]
    final_answer: str


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "solve 8x + 31 = 2"},
    ],
    response_format=MathResponse,
)

message = completion.choices[0].message
if message.parsed:
    print(message.parsed.steps)
    print(message.parsed.final_answer)
else:
    print(message.refusal)

Potential Solve

Not sure what the ideal pydantic replacement would be, but perhaps dry-struct? DTO declarations could look like this:

require 'dry-struct'
require 'dry-types'

module Types
  include Dry.Types()
end

class Step < Dry::Struct
  attribute :explanation, Types::String
  attribute :output, Types::String
end

class MathResponse < Dry::Struct
  attribute :steps, Types::Array.of(Step)
  attribute :final_answer, Types::String
end

(Not sure if the gem should handle any schema validation, since that's purportedly OpenAI's job, but there's dry-validation if so.)
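
If the gem did validate, even a very small check could catch a reply whose keys don't match the declaration. A rough sketch using only the standard library; `missing_keys` is a hypothetical helper, and the JSON strings are hand-written stand-ins for model output:

```ruby
require 'json'

# Hypothetical helper: list the required top-level keys absent from a reply.
def missing_keys(schema, data)
  schema.fetch(:required, []).map(&:to_s) - data.keys.map(&:to_s)
end

schema = { type: 'object', required: %w[steps final_answer] }
complete = JSON.parse('{"steps":[],"final_answer":"x = -29/8"}')
partial  = JSON.parse('{"steps":[]}')

p missing_keys(schema, complete) # prints []
p missing_keys(schema, partial)  # prints ["final_answer"]
```

This only looks at top-level keys; anything deeper is where a real validation library would earn its keep.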

The rest of OpenAI's math tutor example might look like

client = OpenAI::Client.new

completion = client.chat(
  parameters: {
    model: "gpt-4o-2024-08-06",
    messages: [
      { role: "system", content: "You are a helpful math tutor."},
      { role: "user", content: "solve 8x + 31 = 2"},
    ],
    response_format: MathResponse
  }
)

message = completion.dig("choices", 0, "message")
if message.parsed
  message.parsed.steps.each do |step|
    puts step.explanation
    puts step.output
  end
  puts message.parsed.final_answer
else
  puts message.refusal
end
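
How `message.parsed` would get populated is left open above. One possible approach, sketched with only the standard library: plain `Struct`s stand in for dry-struct here, and `from_hash` plus the canned `content` string are assumptions made for the example.

```ruby
require 'json'

Step = Struct.new(:explanation, :output, keyword_init: true)

MathResponse = Struct.new(:steps, :final_answer, keyword_init: true) do
  # Hypothetical helper: build the nested DTOs from a parsed JSON hash.
  # Hash#fetch raises KeyError if the reply is missing an expected key.
  def self.from_hash(hash)
    new(
      steps: hash.fetch('steps').map do |s|
        Step.new(explanation: s.fetch('explanation'), output: s.fetch('output'))
      end,
      final_answer: hash.fetch('final_answer')
    )
  end
end

# Hand-written stand-in for the message content, not real API output.
content = <<~JSON
  {"steps":[{"explanation":"Subtract 31 from both sides.","output":"8x = -29"}],
   "final_answer":"x = -29/8"}
JSON

parsed = MathResponse.from_hash(JSON.parse(content))
parsed.steps.each { |step| puts "#{step.explanation} => #{step.output}" }
puts parsed.final_answer
```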


bastos · 2024-08-07T01:24:10Z

I like dry-rb and use it daily, but I think the gem should stay bare-bones and let the developer build abstractions on top of it using dry-rb, Sorbet, Active Record, etc. At the very least, make passing and returning dry-rb objects optional.

Example:

client = OpenAI::Client.new

completion = client.chat(
  parameters: {
    model: "gpt-4o-2024-08-06",
    messages: [
      { role: "system", content: "You are a helpful math tutor." },
      { role: "user", content: "solve 8x + 31 = 2" },
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "math_response",
        strict: true,
        schema: {
          type: "object",
          properties: {
            steps: {
              type: "array",
              items: {
                type: "object",
                properties: {
                  explanation: {
                    type: "string",
                  },
                  output: {
                    type: "string",
                  },
                },
                required: ["explanation", "output"],
                additionalProperties: false,
              },
            },
            final_answer: {
              type: "string",
            },
          },
          required: ["steps", "final_answer"],
          additionalProperties: false,
        },
      },
    },
  }
)

And if response_format is passed, the library should use JSON.parse on the output.
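
A rough sketch of that behaviour, assuming the response hash has the chat completions shape (a `content` and a `refusal` field under `choices[0].message`). `parse_structured` is a hypothetical helper name, not an existing method of the gem, and the two response hashes are hand-written stand-ins.

```ruby
require 'json'

# Hypothetical helper: returns [parsed, refusal]. A refusal is checked
# first because a refused message carries no JSON content to parse.
def parse_structured(response)
  message = response.dig('choices', 0, 'message') || {}
  refusal = message['refusal']
  return [nil, refusal] if refusal

  content = message['content']
  [content && JSON.parse(content), nil]
end

ok = { 'choices' => [{ 'message' => { 'content' => '{"final_answer":"x = -29/8"}', 'refusal' => nil } }] }
refused = { 'choices' => [{ 'message' => { 'content' => nil, 'refusal' => "I can't help with that." } }] }

parsed, refusal = parse_structured(ok)
puts parsed['final_answer'] unless refusal

parsed, refusal = parse_structured(refused)
puts refusal if parsed.nil?
```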

jeremedia · 2024-08-07T22:04:57Z

The groq-ruby gem has an optional dry-schema integration: https://github.com/drnic/groq-ruby#using-dry-schema-with-json-mode

bastos · 2024-08-08T06:08:44Z

The groq-ruby gem has an optional dry-schema integration: https://github.com/drnic/groq-ruby#using-dry-schema-with-json-mode

And the official Node library has support for Zod. So, to argue against my previous comment, maybe the gem should support dry's Dry::Schema.JSON (optionally, I hope).
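
One way to keep such an integration optional is duck typing: accept either a raw Hash or any object that can describe itself as a JSON schema. The sketch below is an assumption-heavy illustration. `normalize_response_format` is a hypothetical helper, `to_json_schema` is an assumed method name, and `FakeSchema` is a stand-in class; none of this calls dry-schema itself.

```ruby
# Hypothetical helper: pass Hashes through untouched, and wrap anything
# that responds to #to_json_schema (an assumed method name) in the
# json_schema envelope.
def normalize_response_format(format, name: 'response')
  return format if format.is_a?(Hash)
  unless format.respond_to?(:to_json_schema)
    raise ArgumentError, 'response_format must be a Hash or respond to #to_json_schema'
  end

  { type: 'json_schema', json_schema: { name: name, strict: true, schema: format.to_json_schema } }
end

# Stand-in for a schema object supplied by an optional dependency.
class FakeSchema
  def to_json_schema
    { type: 'object', properties: { final_answer: { type: 'string' } },
      required: ['final_answer'], additionalProperties: false }
  end
end

raw = { type: 'json_object' }
p normalize_response_format(raw)
p normalize_response_format(FakeSchema.new, name: 'math_response')
```

With this shape the gem never has to require the schema library; callers who want it bring their own object.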

jeremedia · 2024-08-15T18:28:14Z

Here's a ruby script implementing StructuredOutputs::Schema that provides the functionality OpenAI added to their python SDK for super-simple structured output definitions. Replicates their cookbook example perfectly.

Key thing is this simplicity:

class MathReasoning < StructuredOutputs::Schema
  def initialize
    super do
      define :step do
        string :explanation
        string :output
      end
      array :steps, items: ref(:step)
      string :final_answer
    end
  end
end

schema = MathReasoning.new
  
result = client.parse(
  model: "gpt-4o-2024-08-06",
  messages: [
    { role: "system", content: "You are a helpful math tutor. Guide the user through the solution step by step." },
    { role: "user", content: "how can I solve 8x + 7 = -23" }
  ],
  response_format: schema
)

To get this:

{
  "steps": [
    {
      "expression": "8x + 7 = -23",
      "explanation": "This is your starting equation."
    },
    {
      "expression": "8x = -23 - 7",
      "explanation": "Subtract 7 from both sides to isolate the term with 'x' on the left side."
    },
    {
      "expression": "8x = -30",
      "explanation": "Simplify the right side by calculating -23 - 7, which is -30."
    },
    {
      "expression": "x = -30 / 8",
      "explanation": "Divide both sides by 8 to solve for 'x'."
    },
    {
      "expression": "x = -3.75",
      "explanation": "Simplify the division to get the final value of 'x'. Alternatively, divide both sides of the equation 30 by 2 to simplify it down to 15, then divide again to get 15 divided by 4, which is -3.75."
    }
  ],
  "final_answer": "x = -3.75"
}

diegocostares · 2024-09-02T05:14:41Z

The response_format is very necessary. This script is excellent, and I hope it gets implemented in the project because it’s extremely useful!
@alexrudall Do you think it could be included?

frmsaul · 2024-09-02T21:53:32Z

Strongly endorse @bastos's approach.

stephenreid321 · 2024-09-21T19:27:59Z

Hoping this gets included soon!

adenta · 2024-10-01T15:20:44Z

I support this. Is there a bounty?

adenta · 2024-10-01T15:34:01Z

I adapted @jeremedia's script to work with Rails. I renamed Schema to BaseSchema so as not to conflict with the existing schema.rb.

class BaseSchema
  MAX_OBJECT_PROPERTIES = 100
  MAX_NESTING_DEPTH = 5

  def initialize(name = nil, &block)
    # Use the provided name or derive from class name
    @name = name || self.class.name.split('::').last.downcase
    # Initialize the base schema structure
    @schema = {
      type: 'object',
      properties: {},
      required: [],
      additionalProperties: false
    }
    @definitions = {}
    # Execute the provided block to define the schema
    instance_eval(&block) if block_given?
    validate_schema
  end

  # Convert the schema to the json_schema payload. `strict` sits next to
  # `name` and `schema` here rather than inside the schema itself.
  def to_hash
    {
      name: @name,
      description: 'Schema for the structured response',
      strict: true,
      schema: @schema.merge({ '$defs' => @definitions })
    }
  end

  private

  # Define a string property
  def string(name, enum: nil, description: nil)
    add_property(name, { type: 'string', enum:, description: }.compact)
  end

  # Define a number property
  def number(name)
    add_property(name, { type: 'number' })
  end

  # Define a boolean property
  def boolean(name)
    add_property(name, { type: 'boolean' })
  end

  # Define an object property
  def object(name, &block)
    properties = {}
    required = []
    BaseSchema.new.tap do |s|
      s.instance_eval(&block)
      properties = s.instance_variable_get(:@schema)[:properties]
      required = s.instance_variable_get(:@schema)[:required]
    end
    add_property(name, { type: 'object', properties:, required:, additionalProperties: false })
  end

  # Define an array property
  def array(name, items:)
    add_property(name, { type: 'array', items: })
  end

  # Define an anyOf property
  def any_of(name, schemas)
    add_property(name, { anyOf: schemas })
  end

  # Define a reusable schema component
  def define(name, &block)
    @definitions[name] = BaseSchema.new(&block).instance_variable_get(:@schema)
  end

  # Reference a defined schema component
  def ref(name)
    { '$ref' => "#/$defs/#{name}" }
  end

  # Add a property to the schema
  def add_property(name, definition)
    @schema[:properties][name] = definition
    @schema[:required] << name
  end

  # Validate the schema against defined limits
  def validate_schema
    properties_count = count_properties(@schema)
    raise 'Exceeded maximum number of object properties' if properties_count > MAX_OBJECT_PROPERTIES

    max_depth = calculate_max_depth(@schema)
    raise 'Exceeded maximum nesting depth' if max_depth > MAX_NESTING_DEPTH
  end

  # Count the total number of properties in the schema
  def count_properties(schema)
    return 0 unless schema.is_a?(Hash) && schema[:properties]

    count = schema[:properties].size
    schema[:properties].each_value do |prop|
      count += count_properties(prop)
    end
    count
  end

  # Calculate the maximum nesting depth of the schema
  def calculate_max_depth(schema, current_depth = 1)
    return current_depth unless schema.is_a?(Hash) && schema[:properties]

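    max_child_depth = schema[:properties].values.map do |prop|
      calculate_max_depth(prop, current_depth + 1)
    end.max
    # An object with no properties yields nil here (e.g. the blank
    # BaseSchema.new created in #object), so fall back to the current depth
    [current_depth, max_child_depth || current_depth].max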
  end
end
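
Note that `calculate_max_depth` above only follows `:properties`, so objects nested inside array `items` or behind a `$ref` are not counted. A possible variant that also descends into `items`, written as a standalone function for illustration. `schema_depth` is a hypothetical name, it counts object levels slightly differently from the method above, and whether either matches how OpenAI counts nesting is an assumption.

```ruby
# Hypothetical standalone depth counter: each object level counts as one,
# and array items are traversed without adding a level of their own.
# `$ref` entries are not resolved and count as zero.
def schema_depth(node)
  return 0 unless node.is_a?(Hash)

  if node[:type] == 'array'
    schema_depth(node[:items])
  elsif node[:properties]
    1 + (node[:properties].values.map { |child| schema_depth(child) }.max || 0)
  else
    0
  end
end

flat = { type: 'object', properties: { final_answer: { type: 'string' } } }
nested = {
  type: 'object',
  properties: {
    steps: {
      type: 'array',
      items: { type: 'object', properties: { output: { type: 'string' } } }
    }
  }
}

puts schema_depth(flat)   # prints 1
puts schema_depth(nested) # prints 2
```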

require 'json'
require 'ostruct'
require 'openai' # ruby-openai gem; Rails loads it through Bundler, a plain script needs this line

# Client class for interacting with OpenAI API
class OpenAISchemaClient
  def initialize
    OpenAI.configure do |config|
      config.access_token = ENV['OPENPIPE_ACCESS_TOKEN']
      config.uri_base = 'https://app.openpipe.ai/api/v1'
      config.log_errors = true
    end
    @client = OpenAI::Client.new
  end

  # Send a request to OpenAI API and parse the response
  def parse(model:, messages:, response_format:)
    response = @client.chat(
      parameters: {
        model:,
        messages:,
        response_format: {
          type: 'json_schema',
          json_schema: response_format.to_hash
        }
      }
    )

    message = response.dig('choices', 0, 'message')

    # Check for a refusal first: a refused message has no JSON content to parse
    if message['refusal']
      OpenStruct.new(refusal: message['refusal'], parsed: nil)
    else
      OpenStruct.new(refusal: nil, parsed: JSON.parse(message['content']))
    end
  end
end

# example usage:

# begin
#   # Create an OpenAI client
#   client = OpenAISchemaClient.new
#   # Create an instance of the MathReasoning schema
#   schema = MathReasoning.new

#   # Send a request to OpenAI API
#   result = client.parse(
#     model: 'gpt-4o-2024-08-06',
#     messages: [
#       { role: 'system', content: 'You are a helpful math tutor. Guide the user through the solution step by step.' },
#       { role: 'user', content: 'how can I solve 8x + 7 = -23' }
#     ],
#     response_format: schema
#   )

#   # Handle the response
#   if result.refusal
#     puts "The model refused to respond: #{result.refusal}"

#   else
#     puts JSON.pretty_generate(result.parsed)

#   end
# rescue StandardError => e
#   puts "Error: #{e}"
# end


class MathReasoning < BaseSchema
  def initialize
    super do
      define :step do
        string :explanation
        string :output
      end
      array :steps, items: ref(:step)
      string :final_answer
    end
  end
end

jeremedia mentioned this issue Oct 25, 2024

Clean Structured Output Support #541

Open
