Claude Tool Use: Advanced Patterns for Production

Tool use (function calling) is how you give Claude access to real-world actions — APIs, databases, file systems. The basic pattern is covered in the agent quickstart. This guide covers the advanced cases: parallel calls, forced invocation, error propagation, streaming with tools, and anti-patterns to avoid in 2026.

Parallel tool calls

Claude can call multiple tools in a single response when it determines they can run concurrently. This is the default behavior when multiple tool calls are independent.

Example: Claude is asked "What is the weather in Seoul and Tokyo?"

Claude returns both tool_use blocks in one response content array. You must handle all of them:

import asyncio
import anthropic

client = anthropic.Anthropic()

# For parallel execution, gather results concurrently
async def call_tool_async(name: str, input_data: dict) -> str:
    # Replace with actual async tool implementations
    await asyncio.sleep(0.1)  # Simulate I/O
    return f"Result for {name}({input_data})"

async def run_parallel_tools(response: anthropic.types.Message) -> list[dict]:
    tool_calls = [b for b in response.content if b.type == "tool_use"]
    
    # Launch all tool calls concurrently
    tasks = [call_tool_async(tc.name, tc.input) for tc in tool_calls]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    
    tool_results = []
    for tc, result in zip(tool_calls, results):
        if isinstance(result, Exception):
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tc.id,
                "content": f"Error: {result}",
                "is_error": True,
            })
        else:
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tc.id,
                "content": str(result),
            })
    return tool_results

Critical rule: all tool_use blocks in a single response must have matching tool_result blocks in the next user message. If you return results for only some tool calls, the API returns a 400 error.
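
One way to catch a missing result before the API does is to check coverage locally. This is a minimal sketch, not part of the SDK: the helper name build_tool_result_message is hypothetical, and it assumes you have collected the ids of the tool_use blocks from the response.

```python
def build_tool_result_message(tool_use_ids: list[str], tool_results: list[dict]) -> dict:
    """Build the follow-up user message, failing fast if any tool call lacks a result."""
    answered = {r["tool_use_id"] for r in tool_results}
    missing = [tid for tid in tool_use_ids if tid not in answered]
    if missing:
        # Raising locally is cheaper than letting the API reject the request.
        raise ValueError(f"Missing tool_result for tool_use ids: {missing}")
    return {"role": "user", "content": tool_results}
```

With the parallel example above, tool_use_ids would be [tc.id for tc in tool_calls], and the returned dict is appended to messages after the assistant turn.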

Forced tool choice

By default, Claude decides whether to use a tool. You can override this:

tool_choice: { type: "auto" } (default) — Claude decides.

tool_choice: { type: "any" } — Claude must call at least one tool; it chooses which.

tool_choice: { type: "tool", name: "..." } — Claude must call this specific tool.

tool_choice: { type: "none" } — Claude cannot use tools (answer from training only).
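
The four options above can be wrapped in a small helper so that a typo or a missing tool name fails in your code rather than at the API. This is a sketch only; make_tool_choice and VALID_MODES are hypothetical names, not SDK functions.

```python
from typing import Optional

VALID_MODES = {"auto", "any", "tool", "none"}

def make_tool_choice(mode: str = "auto", name: Optional[str] = None) -> dict:
    """Build a tool_choice dict for one of the four modes listed above."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown tool_choice mode: {mode!r}")
    if mode == "tool":
        if not name:
            raise ValueError("mode 'tool' requires a tool name")
        return {"type": "tool", "name": name}
    if name is not None:
        raise ValueError("name is only valid with mode 'tool'")
    return {"type": mode}
```

The result is passed as the tool_choice argument of client.messages.create.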

Forced tool for guaranteed structured output

The most reliable way to get structured output from Claude:

def extract_invoice(text: str) -> dict:
    """Extract invoice fields — guaranteed to call extract_invoice_data tool."""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=[
            {
                "name": "extract_invoice_data",
                "description": "Extract structured invoice data",
                "input_schema": {
                    "type": "object",
                    "properties": {
                        "invoice_number": {"type": "string"},
                        "vendor_name": {"type": "string"},
                        "amount": {"type": "number"},
                        "currency": {"type": "string"},
                        "due_date": {"type": "string", "description": "ISO 8601 date"},
                        "line_items": {
                            "type": "array",
                            "items": {
                                "type": "object",
                                "properties": {
                                    "description": {"type": "string"},
                                    "quantity": {"type": "number"},
                                    "unit_price": {"type": "number"},
                                },
                            },
                        },
                    },
                    "required": ["invoice_number", "vendor_name", "amount"],
                },
            }
        ],
        tool_choice={"type": "tool", "name": "extract_invoice_data"},  # Force it
        messages=[{"role": "user", "content": f"Extract from this invoice:\n\n{text}"}],
    )

    tool_use = next(b for b in response.content if b.type == "tool_use")
    return tool_use.input  # Parsed dict shaped by the schema; validate before trusting it

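Why this beats asking for JSON in the prompt: forcing the tool guarantees that Claude responds with a tool_use block whose input is already a parsed JSON object shaped by your schema, so there is no free-form text to parse. The schema guides generation, but by default the API does not validate the input against it. Required fields can occasionally be missing or mistyped, and a response cut short by max_tokens can leave the input incomplete, so validate tool_use.input in your own code before using it.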
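
As a defensive measure, the tool input can be checked locally before it is used. The sketch below uses only the standard library and covers required keys and top-level primitive types; it is not a full JSON Schema validator (a library such as jsonschema does that), and check_tool_input is a hypothetical helper name.

```python
_JSON_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}

def check_tool_input(data: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    for key in schema.get("required", []):
        if key not in data:
            problems.append(f"missing required field: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key not in data:
            continue
        type_name = spec.get("type")
        expected = _JSON_TYPES.get(type_name) if isinstance(type_name, str) else None
        if expected is None:
            continue  # Unknown or composite type: skipped by this simple check
        value = data[key]
        # bool is a subclass of int in Python, so reject it for numeric fields explicitly.
        bool_as_number = isinstance(value, bool) and type_name in ("number", "integer")
        if bool_as_number or not isinstance(value, expected):
            problems.append(f"{key}: expected {type_name}, got {type(value).__name__}")
    return problems
```

In extract_invoice, this would run on tool_use.input with the tool's input_schema, raising or retrying when the returned list is not empty.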

Tool use with streaming

Streaming adds complexity when tool calls are involved. The pattern:

def stream_with_tools(prompt: str) -> str:
    messages = [{"role": "user", "content": prompt}]

    while True:
        # messages.stream() yields text as it is generated
        with client.messages.stream(
            model="claude-sonnet-4-6",
            max_tokens=2048,
            tools=MY_TOOLS,
            messages=messages,
        ) as stream:
            # Stream text to user in real time
            partial_text = ""
            for text in stream.text_stream:
                print(text, end="", flush=True)
                partial_text += text

            final = stream.get_final_message()

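        # Any stop reason other than tool_use (end_turn, max_tokens, ...) ends the loop;
        # otherwise the same request would be re-sent forever.
        if final.stop_reason != "tool_use":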
            print()
            return partial_text

        if final.stop_reason == "tool_use":
            print()  # Newline after streamed text
            messages.append({"role": "assistant", "content": final.content})
            tool_results = []
            for block in final.content:
                if block.type == "tool_use":
                    result = run_tool(block.name, block.input)
                    print(f"[Tool: {block.name}] → {result[:60]}")
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })
            messages.append({"role": "user", "content": tool_results})
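
The streaming loop relies on MY_TOOLS and run_tool, which this guide does not define. One possible shape for run_tool is a registry of handlers, sketched below. The tool decorator, the TOOL_HANDLERS dict, and the get_weather handler with its canned output are all illustrative assumptions.

```python
import json
from typing import Callable

TOOL_HANDLERS: dict[str, Callable[..., object]] = {}

def tool(name: str):
    """Register a function as the handler for the tool called `name`."""
    def decorator(fn):
        TOOL_HANDLERS[name] = fn
        return fn
    return decorator

@tool("get_weather")
def get_weather(city: str) -> dict:
    # Placeholder data; a real handler would call a weather API here.
    return {"city": city, "forecast": "unknown"}

def run_tool(name: str, input_data: dict) -> str:
    """Dispatch a tool call and always return a string suitable for a tool_result."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'"
    result = handler(**input_data)
    return result if isinstance(result, str) else json.dumps(result)
```

Returning a string in every case keeps the result[:60] preview in the loop safe, and an unknown tool name produces a message Claude can react to instead of an exception.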

Tool schemas: what Claude actually needs

The quality of your tool descriptions directly affects how reliably Claude calls them correctly.

Good tool description:

{
  "name": "search_customers",
  "description": "Search customer records by name, email, or company. Returns matching customers sorted by last_activity descending. Use when user asks about a specific customer or needs to find contact information.",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "Search query. Matches against: full name, email, company name, phone. Minimum 3 characters."
      },
      "limit": {
        "type": "integer",
        "description": "Max results to return. Default: 10. Max: 100.",
        "default": 10
      }
    },
    "required": ["query"]
  }
}

What makes this good:

Description says when to use it, not just what it does
Parameter descriptions include constraints (min 3 chars, max 100 results)
Default values are documented
Return format is described

Common mistakes:

Tool description that just restates the name ("search_customers: searches customers")
No description of what the parameters accept (Claude will guess)
Missing required array (Claude may omit required params)
Tool name contains spaces or special chars (use snake_case)
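
The mistakes above are mechanical enough to check automatically. The sketch below is one way to lint a tool definition before sending it. The lint_tool name, the eight-word description threshold, and the name pattern are assumptions chosen for illustration; check the API reference for the exact naming rule.

```python
import re

# Assumed naming rule: letters, digits, underscores, hyphens, up to 64 characters.
NAME_PATTERN = re.compile(r"[a-zA-Z0-9_-]{1,64}")

def lint_tool(tool: dict) -> list[str]:
    """Return warnings for the common tool-definition mistakes listed above."""
    warnings = []
    if not NAME_PATTERN.fullmatch(tool.get("name", "")):
        warnings.append("name should use only letters, digits, underscores, or hyphens")
    if len(tool.get("description", "").split()) < 8:  # arbitrary threshold
        warnings.append("description is very short; say when to use the tool")
    schema = tool.get("input_schema", {})
    properties = schema.get("properties", {})
    if properties and "required" not in schema:
        warnings.append("input_schema has no 'required' array")
    for param, spec in properties.items():
        if not spec.get("description"):
            warnings.append(f"parameter '{param}' has no description")
    return warnings
```

Running this over every tool at startup, or in a unit test, surfaces weak definitions before they show up as missed tool calls.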

Tool call debugging

When tools aren't being called as expected, these techniques help:

1. Log the raw request:

import logging
logging.basicConfig(level=logging.DEBUG)
# At DEBUG level the SDK and its HTTP client log request details.
# Setting the environment variable ANTHROPIC_LOG=debug has a similar effect.

2. Check the tool call in the response before dispatching:

import json

for block in response.content:
    if block.type == "tool_use":
        print(f"Tool: {block.name}")
        print(f"Input: {json.dumps(block.input, indent=2)}")

3. Verify your input_schema is valid JSON Schema:

import jsonschema

schema = {"type": "object", "properties": {"query": {"type": "string"}}, "required": ["query"]}
jsonschema.Draft7Validator.check_schema(schema)  # Raises if invalid

4. Test with tool_choice: { type: "any" } to confirm Claude is willing to use tools at all. If it still doesn't call tools, your tool description may not match the task.

Anti-patterns to avoid

Anti-pattern 1: Too many tools at once

Tool selection tends to become less reliable as the tool list grows. Around 20 tools per call is a reasonable rule of thumb rather than a hard limit. For large tool sets, pre-filter to the relevant subset:

def get_relevant_tools(user_query: str, all_tools: list) -> list:
    """Return up to 10 tools, ranked by how many words of the tool name appear in the query."""
    # Simple approach: keyword matching
    # Production: semantic similarity with embeddings
    query_lower = user_query.lower()
    scored = [
        (tool, sum(w in query_lower for w in tool["name"].split("_")))
        for tool in all_tools
    ]
    scored.sort(key=lambda x: x[1], reverse=True)
    return [t for t, _ in scored[:10]]

Anti-pattern 2: Catching all tool errors silently

If your tool handler catches all exceptions and returns empty strings, Claude has no information to work with and will hallucinate results. Return informative error messages.
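
A handler wrapper can turn any exception into an informative error result. This is a minimal sketch; run_tool_safely is a hypothetical name, and the handler is assumed to accept the tool input as keyword arguments.

```python
from typing import Callable

def run_tool_safely(tool_use_id: str, handler: Callable[..., object], input_data: dict) -> dict:
    """Run a tool handler and return a tool_result block, marking failures with is_error."""
    try:
        content = handler(**input_data)
    except Exception as exc:
        # Include the exception type and message so Claude can retry, rephrase, or explain.
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"{type(exc).__name__}: {exc}",
            "is_error": True,
        }
    return {"type": "tool_result", "tool_use_id": tool_use_id, "content": str(content)}
```

The error text reaches Claude as the tool result, so keep it specific ("order 42 not found") but free of secrets such as connection strings.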

Anti-pattern 3: Modifying tool schemas between turns

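Changing the available tools mid-conversation can confuse the model, especially when earlier turns contain calls to tools that are no longer defined, and it invalidates any prompt cache built on the tool definitions. If you pre-filter tools, do it once at the start of a conversation and keep that list for the rest of it.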

Anti-pattern 4: Very long tool results

Tool results that are tens of thousands of tokens consume context window and increase cost. Truncate large results:

MAX_TOOL_RESULT_TOKENS = 8000  # ~32K chars

def safe_tool_result(content: str) -> str:
    if len(content) > MAX_TOOL_RESULT_TOKENS * 4:  # rough char estimate
        return content[:MAX_TOOL_RESULT_TOKENS * 4] + "\n[...truncated]"
    return content

FAQ

Can I have Claude call the same tool twice in one turn? Yes. Claude can call any tool multiple times in parallel within a single response. Each call is a separate tool_use block with its own unique id, which you echo back as tool_use_id in the matching tool_result.

What if Claude ignores my tools completely? Check that your tool description includes language about when to use it. Claude relies heavily on the description to decide whether a tool is applicable. Also verify that tool_choice is not set to "none".

Can I stream tool results? No. A tool result is sent as a complete value (a string or a list of content blocks) in a single user message. You can stream the model's response after sending the tool results.

What is the token cost of tool definitions? Each tool definition counts as input tokens. A typical tool with full description and schema is 100-300 tokens. Ten tools add ~1,000-3,000 tokens per request. Use prompt caching on tool definitions for high-volume applications.

Can I nest tool calls (have a tool that calls Claude)? Yes — a tool handler can call the Claude API internally. Be careful with recursion depth and total cost. This is the pattern for meta-agents and reflection chains.

Frequently Asked Questions

How do I force Claude to always call a specific tool instead of answering from memory?

Set tool_choice={"type": "tool", "name": "your_tool_name"} in the API call. This forces Claude to call that tool, even for questions it could answer from training data, and the response contains a tool_use block whose input is shaped by your JSON schema. It is the most reliable way to get structured output from Claude, but validate the input in your own code, because by default the API does not check it against the schema.

Why is Claude not calling my tools even though they are defined?

The most common cause is a vague or missing tool description. Claude decides whether to call a tool based almost entirely on the description field — it should state when to use the tool, not just what it does. Also verify tool_choice is not set to "none", and test with tool_choice: {"type": "any"} to confirm the model is willing to use tools at all.

What happens if I only return results for some tool calls when Claude calls multiple tools at once?

The API returns a 400 error. When Claude calls multiple tools in a single response, every tool_use block must have a matching tool_result block in the next user message. Collect all results — even error results — before sending the next API call.

How do I reduce token costs when using many tools?

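Cache your tool definitions. Each tool definition counts as input tokens (typically 100–300 tokens per tool). Adding cache_control: {"type": "ephemeral"} to the last tool definition in the array caches the whole tools array up to and including that definition. The tokens are still part of every request, but cache hits are billed at a reduced rate and processed faster, which adds up at 10+ tools and high request volume. Very short prompts may fall below the minimum cacheable length and will not be cached.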
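
A small helper can apply the cache breakpoint without touching the original tool list. This is a sketch; with_cache_breakpoint is a hypothetical name, and the cache_control value is the one quoted in the answer above.

```python
import copy

def with_cache_breakpoint(tools: list[dict]) -> list[dict]:
    """Return a copy of `tools` with a cache breakpoint on the last definition."""
    if not tools:
        return []
    cached = copy.deepcopy(tools)  # Avoid mutating the caller's list
    cached[-1]["cache_control"] = {"type": "ephemeral"}
    return cached
```

The returned list is passed as the tools argument. Keep the order and content of the definitions stable between requests, since any change to them produces a different cache prefix.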
