Context Window Management in Claude Code: Keep Performance Sharp in Long Sessions
Claude Code's context window holds everything it can "see" — your CLAUDE.md, the files it has read, your conversation history, and its responses. When the context fills up, quality degrades before you hit the hard limit, and the fix is knowing when to prune, redirect, or start fresh. Claude's 200k token window is large, but multi-hour development sessions can consume it. This guide explains what fills the context, how to detect degradation, and the concrete steps to manage it.
What Goes into the Context Window
In a Claude Code session, the context accumulates:
| Source | Tokens (typical) | Notes |
|---|---|---|
| CLAUDE.md | 300-1,000 | Loaded at session start |
| Files read | 500-2,000 per file | Stays in context after reading |
| Your messages | 50-300 per message | Accumulates across turns |
| Claude's responses | 200-2,000 per response | Accumulates across turns |
| Tool output (bash, read) | Variable | Large outputs stay in context |
| System prompt | ~500 | Claude Code's internal instructions |
A typical 2-hour session uses 40,000-100,000 tokens. A very long session (4+ hours, a large codebase, many files read) can reach 150,000-190,000 tokens, which approaches the limit.
Signs the Context Is Getting Full
Sign 1: Claude "forgets" earlier instructions
If Claude stops following a rule you established 30 messages ago, the context is likely too full. Earlier messages get less attention as context grows — they're not deleted, but their effective influence decreases.
# Early in session:
You: "Always use the cursor pagination pattern from lib/db/pagination.ts"
# 40 messages later:
Claude generates offset-based pagination instead
→ Context saturation sign
Sign 2: Responses become more generic
Claude starts giving solutions that don't fit your specific codebase — using patterns it knows generally rather than patterns from your files.
Sign 3: Claude asks for information it already has
"Could you share the relevant schema?" when it read schema.ts two hours ago — the file is still technically in context but below the effective attention threshold.
Sign 4: Slower, longer responses
As context fills, the model has more to process per token generated. Noticeable latency increase is a soft signal.
Strategy 1: Proactive Context Pruning
Tell Claude what context is no longer relevant:
We've finished implementing the auth changes. That's complete.
For the rest of this session, focus on the payment module.
The auth-related files and discussion are no longer relevant — don't reference them.
@app/api/webhooks/route.ts @lib/stripe/
Let's now implement the subscription upgrade flow.
This doesn't delete the old context — it's still there — but explicitly directing Claude's attention helps it weight recent, relevant context more heavily.
Strategy 2: Summarize Before Pivoting
At natural pivot points, ask Claude to summarize what's been decided before you redirect:
Before we move to the next feature, give me a summary of:
1. What we implemented in this session
2. Any decisions we made that should be in CLAUDE.md
3. Any unresolved questions or TODOs
I'll use this to update CLAUDE.md, then we'll start fresh for the next feature.
Copy the important decisions into CLAUDE.md, then either start a new session or redirect.
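Copying decisions by hand is easy to skip at the end of a long session. A minimal sketch of a helper that appends a dated list of decisions to CLAUDE.md follows. The function name `append_decisions` and the section layout are illustrative assumptions, not a Claude Code feature.

```python
from datetime import date
from pathlib import Path


def append_decisions(claude_md: Path, heading: str, decisions: list[str]) -> str:
    """Append a dated bullet list of decisions to CLAUDE.md and return the added text.

    Hypothetical helper: the heading format is an assumption, adjust it to
    match how your own CLAUDE.md is organized.
    """
    lines = [f"## {heading} ({date.today().isoformat()})"]
    # Skip blank entries so an empty line in the summary does not become a bullet.
    lines += [f"- {d.strip()}" for d in decisions if d.strip()]
    block = "\n".join(lines) + "\n"

    existing = claude_md.read_text(encoding="utf-8") if claude_md.exists() else ""
    if existing and not existing.endswith("\n"):
        existing += "\n"
    # One blank line between existing content and the new section.
    separator = "\n" if existing else ""
    claude_md.write_text(existing + separator + block, encoding="utf-8")
    return block
```

Calling `append_decisions(Path("CLAUDE.md"), "Auth decisions", [...])` with the items from Claude's summary keeps the file append-only, so earlier decisions are never overwritten.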
Strategy 3: Fresh Sessions for Distinct Features
The cleanest context management strategy: separate sessions for separate features.
Session 1: Implement user authentication (2 hours)
→ Update CLAUDE.md with any new decisions
→ End session
Session 2: Implement billing (2 hours)
→ Starts with full 200k window
→ CLAUDE.md has all the decisions from Session 1
This is better than one 4-hour session because:
- Full context attention for each feature
- No cross-contamination between feature implementations
- CLAUDE.md carries more weight when it is read into a fresh, uncluttered window
Strategy 4: Reference Files Instead of Pasting
When you need Claude to see code, reference it rather than pasting:
# Worse — fills context immediately with the full file
Here's our entire user service: [paste 500 lines]
# Better — Claude reads what it needs, can re-read if needed
@lib/services/user.ts — look at the getUserById function specifically
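An @file reference has Claude Code read the file from disk, so the content is current and Claude can re-read it later if needed. The content still counts toward context once it has been read, so name the specific function or section you care about rather than pointing at a whole module. Pasting code puts the full content into your message history permanently.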
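Before referencing several files, it can help to know roughly how large each one is. The sketch below ranks files by an estimated token count. The 4-characters-per-token ratio is a rough assumption, not the real tokenizer, and `estimate_tokens` and `rank_files_by_size` are hypothetical helper names.

```python
from pathlib import Path

# Assumption: about 4 characters per token. Real token counts vary with the
# tokenizer and the content, so treat the result as an order of magnitude.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate derived from character count."""
    return len(text) // CHARS_PER_TOKEN


def rank_files_by_size(paths: list[Path]) -> list[tuple[str, int]]:
    """Return (path, estimated tokens) pairs, largest first."""
    sizes = [
        (str(p), estimate_tokens(p.read_text(encoding="utf-8", errors="replace")))
        for p in paths
    ]
    return sorted(sizes, key=lambda item: item[1], reverse=True)
```

Files that land at the top of the ranking are the ones worth narrowing to a single function before you reference them.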
Strategy 5: Use --max-turns for Bounded Tasks
When running Claude Code for automated or background tasks, limit turns to prevent unbounded context accumulation:
# Limit to 10 turns maximum for this task
claude --print --max-turns 10 \
"Run tests, identify failures, fix the failing tests. Stop after 10 turns."
This prevents runaway sessions that fill the context and loop unnecessarily.
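If the bounded task is launched from a script rather than typed by hand, the same command can be wrapped in a small function. This is a sketch that uses only the `--print` and `--max-turns` flags shown above. The helper names and the timeout value are assumptions, and `run_bounded` requires the `claude` CLI on your PATH.

```python
import subprocess


def build_claude_command(prompt: str, max_turns: int) -> list[str]:
    """Build the argument list for a bounded, non-interactive run."""
    if max_turns < 1:
        raise ValueError("max_turns must be at least 1")
    return ["claude", "--print", "--max-turns", str(max_turns), prompt]


def run_bounded(prompt: str, max_turns: int = 10, timeout_s: int = 600) -> str:
    """Run one bounded task and return its stdout.

    timeout_s is a hypothetical safety net on top of the turn limit:
    subprocess.run raises TimeoutExpired if the process takes longer.
    """
    result = subprocess.run(
        build_claude_command(prompt, max_turns),
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout
```

Passing the prompt as a list element instead of interpolating it into a shell string avoids quoting problems when the prompt contains quotes or newlines.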
The Checkpoint Pattern
For long development sessions, checkpoint at the end of each logical unit:
Checkpoint: We've completed the user profile feature.
Please do the following:
1. List every file you modified this session
2. Summarize the key decisions and patterns we established
3. List any TODOs or follow-ups we identified
I'll commit these changes and update CLAUDE.md before continuing.
This forces a useful summary, helps you catch anything missed, and gives you a clean break point.
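The file list in a checkpoint summary can be cross-checked against version control before you commit. The sketch below compares the files Claude reports with what `git diff --name-only HEAD` shows. The function names are illustrative, and untracked files are not covered by this git command.

```python
import subprocess


def git_changed_files() -> set[str]:
    """Tracked files that differ from HEAD (untracked files are not included)."""
    out = subprocess.run(
        ["git", "diff", "--name-only", "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return {line.strip() for line in out.splitlines() if line.strip()}


def compare_with_summary(
    git_files: set[str], claimed_files: set[str]
) -> dict[str, list[str]]:
    """Report mismatches between git's view and the checkpoint summary."""
    return {
        # Changed on disk but missing from Claude's list: review before committing.
        "changed_but_not_listed": sorted(git_files - claimed_files),
        # Listed by Claude but unchanged in git: possibly reverted or misremembered.
        "listed_but_unchanged": sorted(claimed_files - git_files),
    }
```

Run it as `compare_with_summary(git_changed_files(), {...})` with the paths from Claude's answer. Two empty lists mean the summary matches the working tree.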
When to Start a New Session vs. Continue
Continue the session when:
- You're in the middle of implementing something coherent
- The files you've loaded are still relevant
- Claude is still following your conventions correctly
- Session is under 2 hours old
Start a new session when:
- You're starting a genuinely new feature unrelated to current context
- Claude has stopped following conventions consistently
- The session has been running 3+ hours
- You've just completed a logical unit (feature, bug fix, refactor sprint)
Quick test: Ask Claude to repeat a rule you established early in the session. If it gets it right, context is fine. If it fumbles or responds generically, start fresh.
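The "start a new session" criteria above can be written down as a small checklist function. This is only a sketch of the rules in this section. `SessionState` and its field names are hypothetical, and the 3-hour threshold is taken from the list above.

```python
from dataclasses import dataclass


@dataclass
class SessionState:
    hours_elapsed: float
    following_conventions: bool          # result of the "repeat a rule" quick test
    starting_unrelated_feature: bool = False
    just_finished_unit: bool = False     # feature, bug fix, or refactor completed


def should_start_new_session(state: SessionState) -> tuple[bool, list[str]]:
    """Return (decision, reasons). Any single reason is enough to start fresh."""
    reasons: list[str] = []
    if state.starting_unrelated_feature:
        reasons.append("new feature unrelated to current context")
    if not state.following_conventions:
        reasons.append("conventions no longer followed consistently")
    if state.hours_elapsed >= 3:
        reasons.append("session has been running 3+ hours")
    if state.just_finished_unit:
        reasons.append("logical unit just completed")
    return (bool(reasons), reasons)
```

Returning the reasons alongside the decision makes it easy to note in CLAUDE.md why a session was ended.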
Token Budget Estimation
Rough math for planning:
# Rough context budget for a Claude Code session
CLAUDE_MD = 800 # Typical CLAUDE.md
SYSTEM_PROMPT = 500 # Claude Code internals
FILES_LOADED = 1500 * 10 # 10 files × 1,500 tokens each
CONVERSATION = 300 * 30 # 30 turns × 300 tokens average per exchange
total = CLAUDE_MD + SYSTEM_PROMPT + FILES_LOADED + CONVERSATION
# = 800 + 500 + 15,000 + 9,000 = 25,300 tokens
# About 174,700 of the 200,000 tokens are still available at this point
# That leaves room for roughly 7 more blocks of this size:
# about 210 more turns and 70 more files before nearing the limit
# (tool output is not counted here and can shrink this considerably)
For most developers, hitting context limits in a single session is rare unless you're loading large files repeatedly or having very long conversations. The degradation in quality usually matters before the hard limit.
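The same arithmetic can be parameterized so you can plug in your own numbers. This is a sketch under the assumptions used in this section (200,000-token window, 1,500 tokens per file, 300 tokens per turn). The `reserve` parameter is a hypothetical safety margin, since quality tends to degrade before the hard limit.

```python
CONTEXT_WINDOW = 200_000  # window size used throughout this guide


def remaining_budget(
    files_loaded: int,
    turns: int,
    tokens_per_file: int = 1_500,
    tokens_per_turn: int = 300,
    claude_md: int = 800,
    system_prompt: int = 500,
    tool_output: int = 0,
) -> int:
    """Estimated tokens left after the given number of files and turns."""
    used = (
        claude_md
        + system_prompt
        + files_loaded * tokens_per_file
        + turns * tokens_per_turn
        + tool_output
    )
    return CONTEXT_WINDOW - used


def turns_until_full(
    remaining: int, tokens_per_turn: int = 300, reserve: int = 20_000
) -> int:
    """How many more turns fit while keeping `reserve` tokens unused."""
    return max(0, (remaining - reserve) // tokens_per_turn)


left = remaining_budget(files_loaded=10, turns=30)
print(left)                                          # 174700
print(turns_until_full(left))                        # 515
print(turns_until_full(left, tokens_per_turn=2_000)) # 77
```

The estimate is dominated by the per-turn figure: at 300 tokens per turn the session has hundreds of turns left, while at 2,000 tokens per turn (long responses plus tool output) the same budget lasts fewer than 80.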
Frequently Asked Questions
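What happens when Claude Code hits the context limit? Current versions of Claude Code compact the conversation automatically as it approaches the limit, replacing earlier turns with a summary so the session can continue. The summary is lossy, so specific instructions and file details can drop out. Well before that point you'll often see quality degradation: Claude ignoring earlier instructions or giving generic responses.
Can I see how many tokens I've used in a session? Recent versions of Claude Code expose this through slash commands: /context shows how the window is currently being used, and /cost reports token usage for the session. Run /help to confirm what your version supports. You can also estimate from files loaded and message count using the rough math above. If you call the Anthropic API through the SDK directly, each response includes token usage.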
Does starting a new session lose my project context? No — CLAUDE.md persists between sessions. Only the conversation history and dynamically read files are lost. This is why CLAUDE.md is valuable: it's your "save state" that every new session starts with.
Should I keep CLAUDE.md short to save context? Yes, but not at the expense of important information. 300-800 tokens is a good target. Every token in CLAUDE.md is read on every session start — make each token count. Trim generalities; keep specifics.
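Does Claude Code compress conversation history automatically? Yes. Claude Code auto-compacts the conversation when the context is nearly full. You can also trigger compaction yourself with /compact (optionally with instructions on what to keep) or wipe the history with /clear. Compaction is a lossy summary, so the checkpoint and prune strategies are still worth doing manually: they let you decide what survives.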