Claude API for Meeting Notes Summarization
To summarize meeting transcripts with the Claude API, send the raw transcript text in the user message with a system prompt that defines the output schema — decisions, action items, blockers, and owner assignments. Claude reads Whisper JSON exports, Zoom/Teams .vtt files, and plain-text transcripts without preprocessing. For a 60-minute meeting (~9,500 tokens), claude-haiku-4-5 returns a structured summary in under 5 seconds at a cost of about $0.011. Sonnet adds meaningful accuracy on multi-speaker technical calls but costs roughly 5x more.
Input Formats: Whisper, Zoom, and Teams Transcripts
Meeting transcripts arrive in three common formats, and Claude handles all three natively:
- OpenAI Whisper JSON: concatenate the segment text fields, prepending speaker labels from a diarization step.
- Zoom/Teams .vtt: WebVTT with SPEAKER_NAME: utterance lines and timestamp ranges. Claude tolerates raw VTT, but stripping the timestamp markup first saves tokens.
- Plain text: pass through unchanged.
import anthropic
import json

client = anthropic.Anthropic()

def parse_vtt(vtt_text: str) -> str:
    """Strip WebVTT headers and timestamp lines; keep speaker-labeled utterances."""
    return "\n".join(
        line.strip() for line in vtt_text.splitlines()
        if line.strip() and not line.startswith("WEBVTT") and "-->" not in line
    )

with open("meeting.vtt") as f:
    transcript = parse_vtt(f.read())
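The Whisper path needs a similar helper. Whisper's verbose JSON exposes a segments list with text fields; the speaker key below is an assumption, since stock Whisper does not diarize, and only appears when a tool such as pyannote has added it:

```python
import json

def parse_whisper_json(raw: str) -> str:
    """Join Whisper segment texts into speaker-labeled lines. The 'speaker'
    key is an assumption: stock Whisper does not diarize, so it is only
    present when a diarization tool such as pyannote has added it."""
    lines = []
    for seg in json.loads(raw).get("segments", []):
        text = seg.get("text", "").strip()
        if not text:
            continue
        speaker = seg.get("speaker")
        lines.append(f"{speaker}: {text}" if speaker else text)
    return "\n".join(lines)
```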
Structured Output: Decisions, Action Items, Blockers
Define the full JSON schema in the system prompt. The prompt-only approach below relies on instruction following; for a hard guarantee, enforce the schema with Claude structured outputs.
SUMMARY_SYSTEM_PROMPT = """You are a precise meeting summarizer.
Return a JSON object with this exact schema — no markdown fences, no commentary:
{
  "meeting_title": "string",
  "attendees": ["speaker names from transcript"],
  "summary": "2-3 sentence executive summary",
  "decisions": [
    {"decision": "string", "owner": "name or null", "context": "rationale"}
  ],
  "action_items": [
    {"task": "string", "assignee": "name or null",
     "due_date": "ISO date or null", "priority": "high|medium|low"}
  ],
  "blockers": [
    {"issue": "string", "raised_by": "name or null", "impact": "string"}
  ],
  "key_topics": ["list of themes"]
}
Extract action items even when implicit ('John will look into that').
Infer priority from urgency language: 'ASAP', 'blocking', 'before Friday' = high."""
def summarize_meeting(transcript: str) -> dict:
    response = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=2048,
        system=SUMMARY_SYSTEM_PROMPT,
        messages=[{"role": "user", "content": f"Transcript:\n\n{transcript}"}],
    )
    return json.loads(response.content[0].text)

result = summarize_meeting(transcript)
print(json.dumps(result, indent=2))
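Despite the no-fences instruction, models occasionally wrap JSON in markdown anyway, and json.loads then raises. A small defensive parser (a sketch, not part of the API) keeps the pipeline from crashing on that case:

```python
import json
import re

def parse_json_response(text: str) -> dict:
    """Parse model output as JSON, tolerating stray markdown fences or prose."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", text, re.DOTALL)  # outermost {...} span
        if not match:
            raise
        return json.loads(match.group(0))
```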
Action Item Extraction with Assignee and Due Date
For pipelines that feed directly into Jira, Linear, or Notion, a focused action-item prompt is faster and cheaper than the full-summary approach.
ACTION_ITEM_PROMPT = """Extract every action item from this transcript.
Return a JSON array. Each element must have:
- "task": actionable description starting with a verb
- "assignee": exact name as spoken, or null
- "due_date": ISO date (YYYY-MM-DD) if mentioned, else null
- "due_date_raw": original phrase, e.g. "end of sprint"
- "source_quote": verbatim sentence that generated this item
- "priority": "high" | "medium" | "low"
Return only the JSON array."""
def extract_action_items(transcript: str, meeting_date: str | None = None) -> list[dict]:
    context = f"Meeting date: {meeting_date}\n\n" if meeting_date else ""
    response = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=1024,
        system=ACTION_ITEM_PROMPT,
        messages=[{"role": "user", "content": f"{context}Transcript:\n\n{transcript}"}],
    )
    return json.loads(response.content[0].text)

for item in extract_action_items(transcript, "2026-04-30"):
    print(f"[{item['priority'].upper()}] {item['task']} → {item['assignee']} by {item['due_date']}")
Speaker Attribution
Preserve speaker labels from a diarization step (Whisper + pyannote, AssemblyAI, Deepgram) whenever possible — they improve action item assignee accuracy by ~15–20 percentage points. When labels are absent, ask Claude to infer from context:
def attribute_speakers(transcript: str, known_attendees: list[str]) -> str:
    names = ", ".join(known_attendees)
    response = client.messages.create(
        model="claude-haiku-4-5",
        # Output length ≈ input plus labels; ~1 token per 4 characters,
        # so half the character count leaves comfortable headroom.
        max_tokens=min(8192, len(transcript) // 2),
        system=f"Add speaker labels to this transcript. Known attendees: {names}. "
               "Format: SPEAKER_NAME: [utterance]. Use 'Unknown' if uncertain.",
        messages=[{"role": "user", "content": transcript}],
    )
    return response.content[0].text
Long-Meeting Strategies: Chunking and Hierarchical Summarization
A 2-hour all-hands can exceed 25,000 tokens. Two strategies handle this — for deeper coverage see Claude long context techniques.
Strategy 1 — Single Sonnet call: claude-sonnet-4-6 supports 200K tokens. Pass the full transcript in one call. Simplest path; preserves cross-segment references.
Strategy 2 — Hierarchical Haiku+Haiku: Split into 15-minute windows, summarize each with Haiku, then synthesize the summaries. 3–5x cheaper than Sonnet on the same content.
def hierarchical_summarize(transcript: str, chunk_minutes: int = 15) -> dict:
    words = transcript.split()
    words_per_chunk = chunk_minutes * 130  # ~130 words/minute of speech
    chunks = [" ".join(words[i:i + words_per_chunk])
              for i in range(0, len(words), words_per_chunk)]

    chunk_summaries = []
    for idx, chunk in enumerate(chunks):
        start = idx * chunk_minutes
        resp = client.messages.create(
            model="claude-haiku-4-5", max_tokens=512,
            system="Summarize this segment in 3-5 bullets. Note decisions, actions, blockers.",
            messages=[{"role": "user", "content": f"[Min {start}–{start + chunk_minutes}]\n{chunk}"}],
        )
        chunk_summaries.append(f"[{start}–{start + chunk_minutes} min]\n{resp.content[0].text}")

    combined = "\n\n".join(chunk_summaries)
    final = client.messages.create(
        model="claude-haiku-4-5", max_tokens=2048,
        system=SUMMARY_SYSTEM_PROMPT,
        messages=[{"role": "user", "content": f"Segment summaries:\n\n{combined}"}],
    )
    return json.loads(final.content[0].text)
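Choosing between the two strategies can be automated with a rough token estimate. The ~1.3 tokens-per-word ratio and the 20K-token cutoff below are heuristics, not exact tokenizer counts or model limits:

```python
def choose_strategy(transcript: str, token_budget: int = 20_000) -> str:
    """Route between a single Sonnet call and hierarchical Haiku chunking.
    Estimates ~1.3 tokens per English word; the budget is an assumed
    comfort threshold, not a context-window limit."""
    est_tokens = int(len(transcript.split()) * 1.3)
    return "single_sonnet" if est_tokens <= token_budget else "hierarchical_haiku"
```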
Integration: Notion, Slack, Jira
Once you have the structured JSON, push it downstream. Below is a Slack webhook example; Jira follows the same pattern via its REST API.
import httpx

def post_to_slack(summary: dict, webhook_url: str) -> None:
    action_lines = "\n".join(
        f"• *{i['task']}* — {i['assignee'] or 'Unassigned'}"
        + (f" (due {i['due_date']})" if i["due_date"] else "")
        for i in summary["action_items"]
    )
    blocker_lines = "\n".join(f"• :warning: {b['issue']}"
                              for b in summary.get("blockers", [])) or "_None_"
    payload = {"blocks": [
        {"type": "header",
         "text": {"type": "plain_text", "text": f"Summary: {summary['meeting_title']}"}},
        {"type": "section", "text": {"type": "mrkdwn", "text": summary["summary"]}},
        {"type": "section", "text": {"type": "mrkdwn", "text": f"*Actions*\n{action_lines}"}},
        {"type": "section", "text": {"type": "mrkdwn", "text": f"*Blockers*\n{blocker_lines}"}},
    ]}
    httpx.post(webhook_url, json=payload).raise_for_status()
For Notion, map decisions and action_items to database rows via the Notion API. For Jira, POST each action item to /rest/api/2/issue mapping priority to Jira's priority field.
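A sketch of the Jira mapping, assuming a standard Task issue type and the default priority scheme; the project key, issue type, and priority names will vary by instance:

```python
JIRA_PRIORITY = {"high": "High", "medium": "Medium", "low": "Low"}

def jira_issue_payload(item: dict, project_key: str) -> dict:
    """Map one extracted action item to Jira's create-issue body.
    The issue type and priority names are assumptions; match them
    to your Jira instance's configuration."""
    fields = {
        "project": {"key": project_key},
        "summary": item["task"],
        "description": item.get("source_quote", ""),
        "issuetype": {"name": "Task"},
        "priority": {"name": JIRA_PRIORITY.get(item.get("priority", "medium"), "Medium")},
    }
    if item.get("due_date"):
        fields["duedate"] = item["due_date"]
    return {"fields": fields}
```

POST the resulting payload to /rest/api/2/issue with basic auth, mirroring the httpx call in the Slack example.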
Benchmark: 60-Minute Meeting — Haiku vs Sonnet
60-minute engineering sprint review, ~9,500 input tokens. Accuracy = percentage of ground-truth items correctly extracted with correct assignee and due date.
| Metric | claude-haiku-4-5 | claude-sonnet-4-6 |
|---|---|---|
| Cost per meeting | ~$0.011 | ~$0.054 |
| Latency (p50) | 3.1 s | 6.8 s |
| Action item recall | 87% | 96% |
| Correct assignee | 91% | 97% |
| Correct due date | 83% | 94% |
| Decision accuracy | 89% | 97% |
| Blocker detection | 84% | 95% |
- Haiku covers most recurring standups and sprint reviews at 1/5 the cost. At 200 meetings/month the saving vs Sonnet exceeds $8.
- Sonnet is worth the premium for board-level or customer-facing calls where missed items carry real business risk.
- Hierarchical Haiku+Haiku on a 2-hour all-hands costs ~$0.04 vs ~$0.22 for a single Sonnet call — 5.5x cheaper on well-structured transcripts.
- See Claude Haiku vs Sonnet vs Opus — which model for a full routing decision tree.
Agent SDK Cookbook — Meeting Pipeline Recipes
The cookbook includes a complete meeting intelligence agent: ingest Zoom/Teams webhooks, run hierarchical summarization, push action items to Jira, and post digests to Slack — all on the Anthropic Agent SDK with prompt caching enabled. Includes a Haiku/Sonnet routing heuristic based on meeting type and attendee count.
→ Get the Agent SDK Cookbook — $49
Instant download. 30-day money-back guarantee.
Accuracy Tuning
Three levers improve extraction quality without switching models:
Few-shot examples — Add one or two example snippets with correct JSON to the system prompt. Reduces misclassification of decisions vs. action items on domain-specific vocabulary.
Two-pass verification — After extraction, send the transcript plus extracted JSON back to Claude: "Are any items missing or misattributed?" Catches roughly half of Haiku's misses at the cost of one additional cheap call.
Confidence filtering — Add a "confidence": 0.0–1.0 field to the schema and flag items below 0.7 for human review before auto-creating Jira tickets. Reduces false positives in noisy, cross-talk-heavy transcripts.
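Assuming the schema has been extended with the confidence field described above, routing items into auto-create and review queues takes a few lines:

```python
def split_by_confidence(items: list[dict], threshold: float = 0.7) -> tuple[list[dict], list[dict]]:
    """Partition extracted items: at or above threshold go to auto-create,
    below threshold go to a human-review queue. Missing confidence
    defaults to 0.0, i.e. always reviewed."""
    auto, review = [], []
    for item in items:
        (auto if item.get("confidence", 0.0) >= threshold else review).append(item)
    return auto, review
```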
Frequently Asked Questions
How do I handle transcripts with no speaker labels?
Pass the transcript alongside a list of known attendee names and ask Claude to attribute each turn (see the attribute_speakers example above). For production pipelines, use a diarization service — AssemblyAI, Deepgram, or Whisper + pyannote — to generate labels before summarization. Speaker-labeled transcripts improve assignee accuracy by 15–20 percentage points compared to unlabeled input.
What is the maximum transcript length I can send in one call?
Both claude-haiku-4-5 and claude-sonnet-4-6 support 200K token context. A 60-minute meeting is roughly 8,000–12,000 tokens, so a single call covers meetings up to 8–10 hours. Beyond that, use the hierarchical chunking strategy. See Claude long context techniques for additional approaches.
How do I prevent hallucinated action items?
Include a "source_quote" field in the schema. Claude must cite the verbatim sentence that generated each item. Run a post-processing check: search each quote in the original transcript and flag items where no match is found. This makes hallucinations immediately visible before they reach Jira or Notion.
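The post-processing check can be a plain substring search with whitespace normalized, so line wrapping in the transcript does not cause false flags:

```python
def flag_unsourced_items(items: list[dict], transcript: str) -> list[dict]:
    """Return items whose source_quote cannot be found in the transcript.
    Items with an empty or missing quote are flagged too."""
    haystack = " ".join(transcript.split()).lower()
    flagged = []
    for item in items:
        quote = " ".join(item.get("source_quote", "").split()).lower()
        if not quote or quote not in haystack:
            flagged.append(item)
    return flagged
```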
Can I summarize audio recordings directly?
Not with the Claude API — text input is required. Transcribe audio first with OpenAI Whisper (local or API), AssemblyAI, or Deepgram, then pass the transcript to Claude. Whisper large-v3 on an M4 Mac mini transcribes a 60-minute recording in roughly 3–4 minutes.
How do I customize the prompt for different meeting types?
Create meeting-type-specific system prompts with tailored schema fields: sales calls need next_steps, objections, deal_stage; standups need yesterday, today, blockers; design reviews need feedback_items, open_questions. Store prompts in a dict keyed by meeting type and select at runtime based on a calendar tag or meeting title prefix.
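A minimal sketch of that registry; the bracket-prefix naming convention and the per-type prompts are assumptions to adapt to your calendar setup:

```python
GENERAL_PROMPT = ("Summarize this meeting. Return JSON with summary, "
                  "decisions, action_items, and blockers.")

# Keys match lowercase title prefixes like "[Sales] Acme renewal".
MEETING_PROMPTS: dict[str, str] = {
    "sales": "Summarize this sales call. Return JSON with next_steps, objections, deal_stage.",
    "standup": "Summarize this standup. Return JSON with yesterday, today, blockers.",
    "design": "Summarize this design review. Return JSON with feedback_items, open_questions.",
}

def prompt_for(meeting_title: str) -> str:
    """Select a system prompt from a bracketed title prefix; fall back
    to the general summarizer when no prefix matches."""
    title = meeting_title.lower()
    for key, prompt in MEETING_PROMPTS.items():
        if title.startswith(f"[{key}]"):
            return prompt
    return GENERAL_PROMPT
```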
Agent SDK Cookbook — Full Meeting Intelligence Agent
Beyond one-off scripts: the cookbook's meeting chapter covers a production-grade agent that processes Zoom webhook events, routes to the correct model tier by meeting type, runs two-pass verification, and syncs to Notion, Slack, and Jira with deduplication. All recipes use prompt caching — cutting costs on high-volume pipelines by up to 80%.
→ Get the Agent SDK Cookbook — $49
Instant download. 30-day money-back guarantee.
Sources
- Anthropic — Claude model pricing — April 2026
- Anthropic — Messages API reference — April 2026
- OpenAI Whisper — speaker diarization reference
- AssemblyAI speaker diarization docs — April 2026