Context Window Management in Claude Code: Keep Performance Sharp in Long Sessions

Claude Code's context window holds everything it can "see" — your CLAUDE.md, the files it has read, your conversation history, and its responses. When the context fills up, quality degrades before you hit the hard limit, and the fix is knowing when to prune, redirect, or start fresh. Claude's 200k token window is large, but multi-hour development sessions can consume it. This guide explains what fills the context, how to detect degradation, and the concrete steps to manage it.

What Goes into the Context Window

In a Claude Code session, the context accumulates:

Source	Tokens (typical)	Notes
CLAUDE.md	300-1,000	Loaded at session start
Files read	500-2,000 per file	Stays in context after reading
Your messages	50-300 per message	Accumulates across turns
Claude's responses	200-2,000 per response	Accumulates across turns
Tool output (bash, read)	Variable	Large outputs stay in context
System prompt	~500	Claude Code's internal instructions

A typical 2-hour session uses: 40,000-100,000 tokens. A very long session (4+ hours, large codebase, many files read): 150,000-190,000 tokens — approaching the limit.

Signs the Context Is Getting Full

Sign 1: Claude "forgets" earlier instructions

If Claude stops following a rule you established 30 messages ago, the context is likely too full. Earlier messages get less attention as context grows — they're not deleted, but their effective influence decreases.

# Early in session:
You: "Always use the cursor pagination pattern from lib/db/pagination.ts"

# 40 messages later:
Claude generates offset-based pagination instead

→ Context saturation sign

Sign 2: Responses become more generic

Claude starts giving solutions that don't fit your specific codebase — using patterns it knows generally rather than patterns from your files.

Sign 3: Claude asks for information it already has

"Could you share the relevant schema?" when it read schema.ts two hours ago — the file is still technically in context but below the effective attention threshold.

Sign 4: Slower, longer responses

As context fills, the model has more to process per token generated. Noticeable latency increase is a soft signal.

Strategy 1: Proactive Context Pruning

Tell Claude what context is no longer relevant:

We've finished implementing the auth changes. That's complete.
For the rest of this session, focus on the payment module.
The auth-related files and discussion are no longer relevant — don't reference them.

@app/api/webhooks/route.ts @lib/stripe/

Let's now implement the subscription upgrade flow.

This doesn't delete the old context — it's still there — but explicitly directing Claude's attention helps it weight recent, relevant context more heavily.

Strategy 2: Summarize Before Pivoting

At natural pivot points, ask Claude to summarize what's been decided before you redirect:

Before we move to the next feature, give me a summary of:
1. What we implemented in this session
2. Any decisions we made that should be in CLAUDE.md
3. Any unresolved questions or TODOs

I'll use this to update CLAUDE.md, then we'll start fresh for the next feature.

Copy the important decisions into CLAUDE.md, then either start a new session or redirect.

Strategy 3: Fresh Sessions for Distinct Features

The cleanest context management strategy: separate sessions for separate features.

Session 1: Implement user authentication (2 hours)
→ Update CLAUDE.md with any new decisions
→ End session

Session 2: Implement billing (2 hours)
→ Starts with full 200k window
→ CLAUDE.md has all the decisions from Session 1

This is better than one 4-hour session because:

Full context attention for each feature
No cross-contamination between feature implementations
Each session's CLAUDE.md reading is more effective with a fresh window

Strategy 4: Reference Files Instead of Pasting

When you need Claude to see code, reference it rather than pasting:

# Worse — fills context immediately with the full file
Here's our entire user service: [paste 500 lines]

# Better — Claude reads what it needs, can re-read if needed
@lib/services/user.ts — look at the getUserById function specifically

@file references tell Claude to read the file, which it does once and caches. Pasting code puts the full content into your message history permanently.

Strategy 5: Use `--max-turns` for Bounded Tasks

When running Claude Code for automated or background tasks, limit turns to prevent unbounded context accumulation:

# Limit to 10 turns maximum for this task
claude --print --max-turns 10 \
  "Run tests, identify failures, fix the failing tests. Stop after 10 turns."

This prevents runaway sessions that fill the context and loop unnecessarily.

The Checkpoint Pattern

For long development sessions, checkpoint at the end of each logical unit:

Checkpoint: We've completed the user profile feature.
Please do the following:
1. List every file you modified this session
2. Summarize the key decisions and patterns we established
3. List any TODOs or follow-ups we identified

I'll commit these changes and update CLAUDE.md before continuing.

This forces a useful summary, helps you catch anything missed, and gives you a clean break point.

When to Start a New Session vs. Continue

Continue the session when:

You're in the middle of implementing something coherent
The files you've loaded are still relevant
Claude is still following your conventions correctly
Session is under 2 hours old

Start a new session when:

You're starting a genuinely new feature unrelated to current context
Claude has stopped following conventions consistently
The session has been running 3+ hours
You've just completed a logical unit (feature, bug fix, refactor sprint)

Quick test: Ask Claude to repeat a rule you established early in the session. If it gets it right, context is fine. If it fumbles or responds generically, start fresh.

Token Budget Estimation

Rough math for planning:

# Rough context budget for a Claude Code session

CLAUDE_MD = 800          # Typical CLAUDE.md
SYSTEM_PROMPT = 500      # Claude Code internals
FILES_LOADED = 1500 * 10 # 10 files × 1,500 tokens each
CONVERSATION = 300 * 30  # 30 turns × 300 tokens average per exchange

total = CLAUDE_MD + SYSTEM_PROMPT + FILES_LOADED + CONVERSATION
# = 800 + 500 + 15,000 + 9,000 = ~25,300 tokens

# Still have ~175,000 tokens available at this point
# At this rate (30 turns, 10 files) you could go ~7x longer
# = about 210 turns or 70 files before nearing limits

For most developers, hitting context limits in a single session is rare unless you're loading large files repeatedly or having very long conversations. The degradation in quality usually matters before the hard limit.

Frequently Asked Questions

What happens when Claude Code hits the context limit? Claude Code will give an error or truncate gracefully, depending on the implementation. Before hitting the hard limit, you'll see quality degradation — Claude ignoring earlier instructions or giving generic responses.

Can I see how many tokens I've used in a session? Not directly in the Claude Code CLI interface. You can estimate based on files loaded and message count using the rough math above. The Anthropic API returns token usage in responses if you're using the SDK directly.

Does starting a new session lose my project context? No — CLAUDE.md persists between sessions. Only the conversation history and dynamically read files are lost. This is why CLAUDE.md is valuable: it's your "save state" that every new session starts with.

Should I keep CLAUDE.md short to save context? Yes, but not at the expense of important information. 300-800 tokens is a good target. Every token in CLAUDE.md is read on every session start — make each token count. Trim generalities; keep specifics.

Does Claude Code compress conversation history automatically? Claude Code does not automatically compress or summarize conversation history. This is intentional — you control what's in context. Use the checkpoint and prune strategies manually.

Related Guides

Context Engineering for Claude — Designing what goes into context
Claude Code Complete Guide — All Claude Code features
Token Counting: Why Your Estimates Are Wrong — Token math deep dive

Go Deeper

Power Prompts 300 — $29 — Includes session management patterns, CLAUDE.md templates for different project types, and checkpoint prompt templates that extract maximum value at each session end.

→ Get Power Prompts 300 — $29

30-day money-back guarantee. Instant download.