Migrate from OpenAI API to Claude (Anthropic SDK): Step-by-Step Guide
Migrating from the OpenAI API to Claude takes 30–60 minutes for most projects — the REST interfaces are similar, but the message format, system prompt placement, tool definitions, and streaming event names differ. This guide maps every OpenAI API concept to its Claude equivalent and provides side-by-side code comparisons for Python and TypeScript so you can migrate line by line.
Why Migrate to Claude
- 200K token context window — versus 128K for GPT-4o
- Better instruction following — Claude consistently follows complex structured output rules
- Lower cost at scale — Claude Haiku is significantly cheaper than GPT-4o-mini for many workloads
- Prompt caching — cache static system prompts at 10% of input cost, reducing cost by up to 90% on repeated calls (see Claude API Cost and Prompt Caching)
Model Mapping
| OpenAI Model | Claude Equivalent | Notes |
|---|---|---|
| gpt-4o | claude-sonnet-4-5 | Best everyday model, similar quality |
| gpt-4o-mini | claude-haiku-4-5 | Fastest/cheapest for simple tasks |
| o1, o3 | claude-opus-4-5 | Complex reasoning, highest capability |
| gpt-3.5-turbo | claude-haiku-4-5 | Legacy fast model replacement |
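For scripted migrations, the table above can be captured as a lookup table. A minimal sketch (the dict and `map_model` helper are illustrative; adjust the mapping to your own model choices):

```python
# Hypothetical helper: map OpenAI model names to their Claude equivalents
# so a migration script can rewrite model strings mechanically.
OPENAI_TO_CLAUDE = {
    "gpt-4o": "claude-sonnet-4-5",
    "gpt-4o-mini": "claude-haiku-4-5",
    "o1": "claude-opus-4-5",
    "o3": "claude-opus-4-5",
    "gpt-3.5-turbo": "claude-haiku-4-5",
}

def map_model(openai_model: str) -> str:
    """Return the Claude equivalent, or raise for unmapped models."""
    try:
        return OPENAI_TO_CLAUDE[openai_model]
    except KeyError:
        raise ValueError(f"No Claude mapping for {openai_model!r}")
```

Failing loudly on unmapped names is deliberate: a silent fallback would hide models your codebase uses that the table does not cover.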
Python Migration
Install
# Remove OpenAI
pip uninstall openai
# Install Anthropic
pip install anthropic
Basic Completion
OpenAI:
from openai import OpenAI
client = OpenAI(api_key="sk-...")
response = client.chat.completions.create(
model="gpt-4o",
messages=[
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "Summarize this article: ..."}
],
max_tokens=500,
temperature=0.7,
)
print(response.choices[0].message.content)
Claude equivalent:
import anthropic
client = anthropic.Anthropic(api_key="sk-ant-...")
response = client.messages.create(
model="claude-sonnet-4-5",
system="You are a helpful assistant.", # system is a top-level param, NOT in messages
messages=[
{"role": "user", "content": "Summarize this article: ..."}
],
max_tokens=500,
# Note: temperature defaults to 1.0; range is 0-1
)
print(response.content[0].text)
Key differences:
- `system` is a top-level parameter, not a message with `role: "system"`
- Response text is at `response.content[0].text`, not `response.choices[0].message.content`
- `max_tokens` is required (no default)
- Temperature range: OpenAI uses 0–2, Claude uses 0–1
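These differences are mechanical enough to automate. A minimal sketch of a request converter (a hypothetical helper; it assumes plain string message contents and hard-codes the target model):

```python
def convert_request(openai_kwargs: dict) -> dict:
    """Translate OpenAI chat.completions kwargs into Claude messages.create kwargs."""
    msgs = openai_kwargs["messages"]
    claude = {
        "model": "claude-sonnet-4-5",  # substitute your mapped model
        # system messages move out of the list (see below)
        "messages": [m for m in msgs if m["role"] != "system"],
        # max_tokens is required on Claude; pick a sensible fallback
        "max_tokens": openai_kwargs.get("max_tokens") or 1024,
    }
    system_parts = [m["content"] for m in msgs if m["role"] == "system"]
    if system_parts:
        claude["system"] = "\n\n".join(system_parts)
    if "temperature" in openai_kwargs:
        # OpenAI's 0-2 scale compressed into Claude's 0-1 scale
        claude["temperature"] = min(openai_kwargs["temperature"] / 2, 1.0)
    return claude
```

Halving the temperature is a rough heuristic, not an exact behavioral match; re-tune sampling parameters after migrating.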
TypeScript / Node.js Migration
npm uninstall openai
npm install @anthropic-ai/sdk
OpenAI TypeScript:
import OpenAI from "openai";
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const response = await client.chat.completions.create({
model: "gpt-4o",
messages: [
{ role: "system", content: "You are a helpful assistant." },
{ role: "user", content: "Write a product description for..." }
],
max_tokens: 300,
});
const text = response.choices[0].message.content;
Claude TypeScript:
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });
const response = await client.messages.create({
model: "claude-sonnet-4-5",
system: "You are a helpful assistant.",
messages: [
{ role: "user", content: "Write a product description for..." }
],
max_tokens: 300,
});
const text = response.content[0].type === "text" ? response.content[0].text : "";
Streaming Migration
OpenAI streaming:
stream = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Tell me a story"}],
stream=True
)
for chunk in stream:
delta = chunk.choices[0].delta.content
if delta:
print(delta, end="", flush=True)
Claude streaming:
with client.messages.stream(
model="claude-sonnet-4-5",
messages=[{"role": "user", "content": "Tell me a story"}],
max_tokens=1024,
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)
For raw SSE events (equivalent to iterating chunks manually), check the event type rather than probing attributes:
with client.messages.stream(
    model="claude-sonnet-4-5",
    messages=[...],
    max_tokens=1024,
) as stream:
    for event in stream:
        if event.type == "content_block_delta" and event.delta.type == "text_delta":
            print(event.delta.text, end="", flush=True)
Tool Use Migration
OpenAI function calling:
import json

tools = [{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string"}
},
"required": ["location"]
}
}
}]
response = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Weather in Seoul?"}],
tools=tools,
tool_choice="auto"
)
# Check for tool call
if response.choices[0].finish_reason == "tool_calls":
tool_call = response.choices[0].message.tool_calls[0]
name = tool_call.function.name
args = json.loads(tool_call.function.arguments)
Claude tool use:
tools = [{
"name": "get_weather",
"description": "Get weather for a location",
"input_schema": { # "input_schema" not "parameters"
"type": "object",
"properties": {
"location": {"type": "string"}
},
"required": ["location"]
}
}]
response = client.messages.create(
model="claude-sonnet-4-5",
messages=[{"role": "user", "content": "Weather in Seoul?"}],
tools=tools,
max_tokens=1024
# No "tool_choice" needed — Claude uses tools automatically
)
# Check for tool call
if response.stop_reason == "tool_use":
tool_block = next(b for b in response.content if b.type == "tool_use")
name = tool_block.name
args = tool_block.input # already a dict, no json.loads needed
Key differences:
- Tool definitions use `input_schema`, not `parameters`
- No `{"type": "function", "function": {...}}` wrapper — Claude tool definitions are flat
- `tool_choice` is optional (default is auto)
- Tool arguments are a dict, not a JSON string
Migrating and optimizing a Claude API integration? The Agent SDK Cookbook ($49) includes production-ready Python and TypeScript agents with prompt caching, retry logic, cost tracking, and streaming patterns — ready to adapt after migration.
→ Get the Agent SDK Cookbook — $49
Multi-Turn Conversation Migration
OpenAI conversation history:
messages = [
{"role": "system", "content": "You are a coding assistant."},
{"role": "user", "content": "Write a Python function to sort a list"},
{"role": "assistant", "content": "Here's a sort function: ..."},
{"role": "user", "content": "Now add type hints"},
]
Claude conversation history:
# system is separate — do NOT put it in messages
messages = [
{"role": "user", "content": "Write a Python function to sort a list"},
{"role": "assistant", "content": "Here's a sort function: ..."},
{"role": "user", "content": "Now add type hints"},
]
response = client.messages.create(
model="claude-sonnet-4-5",
system="You are a coding assistant.", # separate parameter
messages=messages,
max_tokens=1024,
)
Rules for Claude message history:
- Messages must alternate `user` → `assistant` → `user`
- The first message must be `user`
- No `system` role in the messages list
- No `tool` role — tool results are sent as `{"role": "user", "content": [{"type": "tool_result", ...}]}`
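A history converter covering these rules can be sketched as follows (illustrative only: it assumes string contents and OpenAI-style `tool` messages carrying a `tool_call_id`, and it does not merge consecutive same-role messages, which strict alternation may require):

```python
def convert_history(openai_messages: list) -> tuple:
    """Split out the system prompt and remap OpenAI roles to Claude's rules."""
    system = None
    claude_msgs = []
    for m in openai_messages:
        if m["role"] == "system":
            system = m["content"]  # becomes the top-level system parameter
        elif m["role"] == "tool":
            # tool results become a user message with a tool_result block
            claude_msgs.append({
                "role": "user",
                "content": [{
                    "type": "tool_result",
                    "tool_use_id": m["tool_call_id"],
                    "content": m["content"],
                }],
            })
        else:
            claude_msgs.append({"role": m["role"], "content": m["content"]})
    return system, claude_msgs
```

Validate the converted list against the alternation rules before sending; the API rejects histories that start with `assistant` or repeat a role.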
Prompt Caching After Migration
Once migrated, add prompt caching to large static system prompts — this can reduce cost by 80–90% for repeated calls:
response = client.messages.create(
model="claude-sonnet-4-5",
system=[{
"type": "text",
"text": "You are a legal document reviewer. [2000 words of legal context...]",
"cache_control": {"type": "ephemeral"}
}],
messages=[{"role": "user", "content": "Review this contract clause: ..."}],
max_tokens=1024,
)
The first call writes the cache at a 25% premium over normal input tokens; subsequent calls read it at 10% of normal input cost, so a single cache read already recoups the write premium. See the full prompt caching cost analysis.
Migration Checklist
# 1. Replace SDK
pip uninstall openai && pip install anthropic
# 2. Update import
# OpenAI: from openai import OpenAI; client = OpenAI()
# Claude: import anthropic; client = anthropic.Anthropic()
# 3. Move system prompt out of messages
# 4. Change "parameters" to "input_schema" in tool definitions
# 5. Update response access: choices[0].message.content → content[0].text
# 6. Update finish_reason checks: "stop" → "end_turn", "tool_calls" → "tool_use"
# 7. Add max_tokens (required, no default in Claude)
# 8. Adjust temperature (0-1 scale, not 0-2)
# 9. Update model names (gpt-4o → claude-sonnet-4-5)
# 10. Test streaming (stream.text_stream vs chunk.choices[0].delta.content)
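Step 6 can be table-driven. A sketch of the stop-reason mapping for code that branches on `finish_reason` (mapping assembled from the differences covered above):

```python
# Maps OpenAI finish_reason values to Claude stop_reason values.
FINISH_REASON_MAP = {
    "stop": "end_turn",        # normal completion
    "length": "max_tokens",    # hit the token limit
    "tool_calls": "tool_use",  # model wants to call a tool
}
```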
Related Guides
- Claude Agent SDK Guide — Complete agent loop patterns for your migrated integration
- Claude API Cost and Prompt Caching — Optimize costs after migration
- Claude Haiku vs Sonnet vs Opus — Choose the right model for your use case
- Structured Outputs and JSON with Claude — Guarantee structured responses
Frequently Asked Questions
Is Claude API drop-in compatible with the OpenAI API?
Not drop-in — the SDKs have different method signatures and the message format differs (system prompt placement, response structure). However, the migration is mechanical and can be scripted. Some providers offer an OpenAI-compatible endpoint for Claude, but the native SDK is recommended for full feature access.
Does Claude support response_format: {type: "json_object"}?
Claude does not use the response_format parameter. Instead, instruct Claude in the system prompt to return JSON and optionally use tool use with a structured input_schema to guarantee structure. See the Structured Outputs guide.
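One way to guarantee a JSON shape is to define a tool whose `input_schema` is the schema you want, then force Claude to call it via `tool_choice`. A sketch (the `record_summary` tool name and schema are hypothetical; the API call is shown commented out):

```python
# Define a tool whose input_schema is the JSON shape you want back.
extract_tool = {
    "name": "record_summary",
    "description": "Record a structured summary of the input text",
    "input_schema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "key_points": {"type": "array", "items": {"type": "string"}},
        },
        "required": ["title", "key_points"],
    },
}

# response = client.messages.create(
#     model="claude-sonnet-4-5",
#     messages=[{"role": "user", "content": "Summarize: ..."}],
#     tools=[extract_tool],
#     tool_choice={"type": "tool", "name": "record_summary"},  # force this tool
#     max_tokens=1024,
# )
# block = next(b for b in response.content if b.type == "tool_use")
# structured = block.input  # dict matching input_schema
```

Because the tool call is forced, `block.input` arrives as a dict that conforms to the schema, with no JSON parsing or retry loop needed.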
How do I handle the missing n parameter (multiple completions)?
OpenAI's n parameter returns multiple completions in one call. Claude does not support this — make N separate API calls or use async parallelism to get multiple outputs.
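A minimal fan-out sketch using `asyncio.gather` (the `call_once` coroutine is a placeholder for your own wrapper around `anthropic.AsyncAnthropic().messages.create`):

```python
import asyncio

async def n_completions(call_once, n: int) -> list:
    """Emulate OpenAI's n parameter with n concurrent Claude calls."""
    return await asyncio.gather(*(call_once() for _ in range(n)))
```

Note that, unlike OpenAI's `n`, each call bills its own input tokens; prompt caching softens this for large shared prompts.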
Are embedding models available on Claude?
No — Anthropic does not offer an embeddings API. For RAG pipelines migrating from OpenAI embeddings, continue using text-embedding-3-small from OpenAI or switch to an open-source model (Nomic, BGE). Use Claude only for the generation/reasoning step.
What happens to logprobs and top_logprobs?
Claude does not expose log probabilities. If your application relies on logprobs for confidence scoring, you'll need an alternative approach — ask Claude to output a confidence score as part of its JSON response.
How long does a migration typically take?
A single-file script: 10–15 minutes. A medium Flask/FastAPI application (5–10 files): 1–2 hours. A production application with streaming, tool use, and caching: 4–8 hours including testing. The Claude Code CLI can automate the mechanical renaming steps.
Go Deeper
Agent SDK Cookbook — $49 — 15 production agent implementations in Python and TypeScript, all using the native Anthropic SDK with prompt caching, retry logic, and cost tracking built in.
→ Get the Agent SDK Cookbook — $49
30-day money-back guarantee. Instant download.