# Structured Outputs

Generate structured responses from LLMs using custom JSON schemas.

## Overview
Structured outputs allow you to constrain LLM responses to follow specific formats, making it easier to integrate AI into your applications. Instead of dealing with free-form text, you can get responses in predictable JSON structures that match your application's data models.
## Use Cases

### Form Data Extraction
Extract structured information from natural language inputs, perfect for:
- User registration forms
- Booking systems
- Support ticket classification
- Customer feedback analysis
### Data Parsing
Transform unstructured data into structured formats for:
- Meeting notes summarization
- Sales lead qualification
- Document processing
- Event extraction from text
## Basic Example: Hotel Booking System
Here's how to extract booking information from natural language queries:
```python
from pydantic import BaseModel
from openai import OpenAI

# Define the schemas
class UserDetails(BaseModel):
    name: str
    booking_id: str

class BookingDetails(BaseModel):
    check_in_date: str
    check_out_date: str
    free_breakfast: bool

# Initialize the client
client = OpenAI(
    base_url="https://api.brilliantai.co",
    api_key="<YOUR_BRILLIANTAI_API_KEY>"
)

# Extract user details from natural language
user_query = """
Hello, my name is Jane Smith. My booking ID is 2345452xyz.
Do I get free breakfast with my stay?
"""

completion = client.beta.chat.completions.parse(
    model="llama-3.3-70b",
    messages=[
        {
            "role": "system",
            "content": "Extract the name and booking ID in the provided format."
        },
        {
            "role": "user",
            "content": user_query
        }
    ],
    response_format=UserDetails
)

user_details = completion.choices[0].message.parsed.model_dump_json(indent=4)
print("Extracted User Details:")
print(user_details)
```
This outputs:
```json
{
    "name": "Jane Smith",
    "booking_id": "2345452xyz"
}
```
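Because `parsed` is a validated `UserDetails` instance, you can also work with typed attributes instead of serialized JSON. For illustration, here is the same object constructed directly, outside an API call:

```python
from pydantic import BaseModel

class UserDetails(BaseModel):
    name: str
    booking_id: str

# Equivalent to completion.choices[0].message.parsed in the example above
details = UserDetails(name="Jane Smith", booking_id="2345452xyz")
print(details.name)          # typed attribute access
print(details.model_dump())  # plain dict for downstream code
```

Pydantic also validates the data on construction, so a missing or mistyped field raises an error rather than propagating silently.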
### Returning Booking Information
After looking up the booking details, you can format the response:
```python
# Simulate booking lookup
booking_info = {
    "check_in_date": "04/04/2024",
    "check_out_date": "07/04/2024",
    "free_breakfast": False
}

completion = client.beta.chat.completions.parse(
    model="llama-3.3-70b",
    messages=[
        {
            "role": "system",
            "content": f"Use these booking details to answer the user's queries in the required format: {booking_info}"
        },
        {
            "role": "user",
            "content": user_query
        }
    ],
    response_format=BookingDetails
)

booking_details = completion.choices[0].message.parsed.model_dump_json(indent=4)
print("\nBooking Details Response:")
print(booking_details)
```
## Advanced Example: Sales Notes Analysis
Extract structured information from unstructured sales notes:
```python
from typing import List
from pydantic import BaseModel

class SalesSummary(BaseModel):
    date: str
    prospects: List[str]
    key_points: List[str]
    action_items: List[str]

sales_note = """
Date: 30/10/2024. Just had a great call with Sarah Johnson (VP of Engineering)
and Mike Peters (Senior DevOps Lead) from TechFlow Solutions. They're really
struggling with their current CI/CD pipeline - apparently deployments are taking
4+ hours and they're having constant issues with their test environment. Sarah
mentioned budget of ~$150-200k for the year, but needs board approval for
anything over $175k. They're also talking to Jenkins Enterprise and GitLab,
but she said their solutions seem overly complex for their needs. Mike was
particularly excited about our automated rollback feature and container
orchestration. Asked for case studies from other fintech companies since
they handle sensitive financial data.
"""

completion = client.beta.chat.completions.parse(
    model="llama-3.3-70b",
    messages=[
        {
            "role": "system",
            "content": "Extract structured information from the sales notes."
        },
        {
            "role": "user",
            "content": sales_note
        }
    ],
    response_format=SalesSummary
)

sales_summary = completion.choices[0].message.parsed.model_dump_json(indent=4)
print(sales_summary)
```
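The resulting `SalesSummary` can feed downstream tooling directly. As a sketch (using illustrative values drawn from the note above, not actual model output), you might render the action items as a task checklist:

```python
from typing import List
from pydantic import BaseModel

class SalesSummary(BaseModel):
    date: str
    prospects: List[str]
    key_points: List[str]
    action_items: List[str]

# Illustrative values; a real run would use completion.choices[0].message.parsed
summary = SalesSummary(
    date="30/10/2024",
    prospects=["Sarah Johnson (VP of Engineering)", "Mike Peters (Senior DevOps Lead)"],
    key_points=["Deployments currently take 4+ hours", "Budget of ~$150-200k for the year"],
    action_items=["Send fintech case studies"],
)
checklist = "\n".join(f"- [ ] {item}" for item in summary.action_items)
print(checklist)
```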
## Error Handling

### Model Refusal
Sometimes the model might refuse to generate a response for safety reasons:
```python
completion = client.beta.chat.completions.parse(
    model="llama-3.3-70b",
    messages=[...],
    response_format=YourSchema
)

response = completion.choices[0].message
if response.refusal:
    print(f"Model refused to respond: {response.refusal}")
else:
    print(response.parsed)
```
### Token Limits
Handle cases where responses might be truncated due to token limits:
```python
import openai

try:
    completion = client.beta.chat.completions.parse(
        model="llama-3.3-70b",
        messages=[...],
        response_format=YourSchema,
        max_tokens=100
    )
    response = completion.choices[0].message
    if response.parsed:
        print(response.parsed)
    elif response.refusal:
        print(f"Model refused: {response.refusal}")
except openai.LengthFinishReasonError as e:
    print(f"Response truncated: {e}")
except Exception as e:
    print(f"Error: {e}")
```
## Supported Types
The structured output feature supports the following JSON Schema types:
- String: Text values
- Number: Floating-point numbers
- Integer: Whole numbers
- Boolean: True/false values
- Object: Nested structures
- Array: Lists of values
- Enum: Predefined set of values
- anyOf: Union types
Root objects must be of type Object (not anyOf). All fields are required by default; to make a field optional, use a union with null.
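In Pydantic, an optional field is expressed as a union with `None`, which serializes to a union with `null` in the generated JSON Schema. The `Guest` model below is a hypothetical schema for illustration:

```python
from typing import Optional
from pydantic import BaseModel

class Guest(BaseModel):
    name: str                         # required (the default)
    loyalty_id: Optional[str] = None  # optional: a union with null

guest = Guest(name="Jane Smith")
print(guest.loyalty_id)  # None when the model omits the field
```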
## Best Practices
- Keep Schemas Simple: Start with simple schemas and add complexity gradually
- Validate Responses: Always handle potential errors and edge cases
- Use Type Hints: Leverage Python's type system with Pydantic for better code quality
- Document Schemas: Keep schema definitions clear and well-documented
- Handle Failures Gracefully: Implement proper error handling for model refusals and token limits
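The refusal and truncation checks shown earlier can be folded into one helper. This is a sketch, not part of the SDK; it assumes a completion object with the `choices[0].message` shape used throughout this page:

```python
def extract_parsed(completion):
    """Return the parsed object from a structured-output completion,
    raising ValueError on a refusal or a missing parse."""
    message = completion.choices[0].message
    if getattr(message, "refusal", None):
        raise ValueError(f"Model refused: {message.refusal}")
    if getattr(message, "parsed", None) is None:
        raise ValueError("No parsed content (response may have been truncated)")
    return message.parsed
```

Call it as `extract_parsed(client.beta.chat.completions.parse(...))` and handle `ValueError` at the call site, so every structured-output call shares one failure path.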