Secure Claude API Endpoints with JWT Authentication
To secure your Claude API endpoints with JWT, issue RS256-signed tokens from your auth server, verify them in a middleware layer before forwarding requests to Anthropic, and never expose your ANTHROPIC_API_KEY to clients. Each incoming request presents a Bearer token; your middleware validates the signature, checks expiry, and extracts the sub (subject) claim to enforce per-user quotas. Reject requests without a valid token before they ever reach the Claude API — this is the single most important architectural rule for any multi-user Claude proxy.
Why Your Claude Proxy Needs Auth
The Claude API is billed per token to your Anthropic account. If you ship your ANTHROPIC_API_KEY in a frontend app or mobile bundle, anyone who extracts it can send requests at your expense until you notice and revoke the key. A single leaked key can run up a large bill over a weekend.
The correct architecture is a proxy layer that sits between your users and Anthropic:
Client → [JWT Bearer token] → Your API Proxy → [ANTHROPIC_API_KEY] → Anthropic
This pattern enforces:
- Authentication: only registered users can call Claude
- Authorization: scoped tokens limit what each user can do
- Quota enforcement: cap tokens consumed per user per day
- Audit trail: every Claude call is attributed to a sub claim
See Claude API Authentication Setup for the baseline credential configuration before layering JWT on top.
JWT Issuance Flow (RS256)
Use RS256 (asymmetric) rather than HS256 for multi-service architectures. Your auth server holds the private key; any service can verify with the public key without exposing signing secrets.
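To make the asymmetry concrete, here is a minimal sketch of RS256 signing and verification using only Node's built-in crypto module. The helper names (signRs256, verifyRs256, b64url) and the throwaway key pair are assumptions for illustration; a real service would load the PEM files and use a maintained JWT library.

```javascript
const crypto = require('node:crypto');

// Throwaway key pair for the demo. In the setup described here, the auth
// server would hold private.pem and the proxy would hold only public.pem.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', { modulusLength: 2048 });

const b64url = (value) => Buffer.from(value).toString('base64url');

// Compact JWT: base64url(header).base64url(payload).base64url(signature)
function signRs256(payload, key) {
  const header = b64url(JSON.stringify({ alg: 'RS256', typ: 'JWT' }));
  const signingInput = `${header}.${b64url(JSON.stringify(payload))}`;
  // RSA + SHA-256 with the default PKCS#1 v1.5 padding is what RS256 means.
  const signature = crypto.sign('sha256', Buffer.from(signingInput), key);
  return `${signingInput}.${signature.toString('base64url')}`;
}

// Verification needs only the public key: check the signature, then expiry.
function verifyRs256(token, key, nowSeconds = Math.floor(Date.now() / 1000)) {
  const parts = token.split('.');
  if (parts.length !== 3) throw new Error('Malformed token');
  const [headerPart, payloadPart, signaturePart] = parts;
  const header = JSON.parse(Buffer.from(headerPart, 'base64url').toString('utf8'));
  if (header.alg !== 'RS256') throw new Error('Unexpected algorithm');
  const signatureOk = crypto.verify(
    'sha256',
    Buffer.from(`${headerPart}.${payloadPart}`),
    key,
    Buffer.from(signaturePart, 'base64url')
  );
  if (!signatureOk) throw new Error('Bad signature');
  const payload = JSON.parse(Buffer.from(payloadPart, 'base64url').toString('utf8'));
  if (typeof payload.exp !== 'number' || payload.exp <= nowSeconds) {
    throw new Error('Token expired');
  }
  return payload;
}
```

A service that holds only publicKey can call verifyRs256 but cannot mint new tokens, which is the property that makes RS256 attractive when several services verify the same tokens.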
Generate keys: openssl genrsa -out private.pem 2048 && openssl rsa -in private.pem -pubout -out public.pem
Issue a token (Node.js):
import fs from 'fs';
import jwt from 'jsonwebtoken';

const PRIVATE_KEY = fs.readFileSync('./private.pem');

// Sign a 15-minute access token carrying the user's plan and daily quota.
export function issueToken(userId, plan = 'free') {
  return jwt.sign(
    { sub: userId, plan, daily_token_limit: plan === 'pro' ? 500_000 : 50_000 },
    PRIVATE_KEY,
    {
      algorithm: 'RS256',
      expiresIn: '15m',
      issuer: 'auth.yourapp.com',
      audience: 'claude-proxy.yourapp.com',
    }
  );
}
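Refresh tokens are opaque random bytes: crypto.randomBytes(40).toString('hex'). Store only a hash of the token (for example SHA-256) in your database, along with a 7-day expiry and a rotated flag, so a database leak does not expose usable tokens.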
Express JWT Verification Middleware
// middleware/verifyJwt.js
import fs from 'fs';
import jwt from 'jsonwebtoken';

const PUBLIC_KEY = fs.readFileSync('./public.pem');

export function verifyJwt(req, res, next) {
  const authHeader = req.headers.authorization;
  if (!authHeader?.startsWith('Bearer ')) {
    return res.status(401).json({ error: 'Missing Bearer token' });
  }
  const token = authHeader.slice(7); // strip the "Bearer " prefix
  try {
    // Pin the algorithm, issuer, and audience so tokens minted for other
    // services or signed with another algorithm are rejected.
    const payload = jwt.verify(token, PUBLIC_KEY, {
      algorithms: ['RS256'],
      issuer: 'auth.yourapp.com',
      audience: 'claude-proxy.yourapp.com',
    });
    req.user = payload; // { sub, plan, daily_token_limit, iat, exp, iss, aud }
    next();
  } catch (err) {
    if (err.name === 'TokenExpiredError') {
      return res.status(401).json({ error: 'Token expired', code: 'TOKEN_EXPIRED' });
    }
    // A bad signature, issuer, or audience is an authentication failure: 401, not 403.
    return res.status(401).json({ error: 'Invalid token' });
  }
}
Apply to your Claude proxy route:
import Anthropic from '@anthropic-ai/sdk';

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from env

app.post('/api/claude', verifyJwt, async (req, res) => {
  const { sub, daily_token_limit } = req.user;
  // Enforce the per-user daily quota before spending any Anthropic tokens.
  if (!(await checkQuota(sub, daily_token_limit))) {
    return res.status(429).json({ error: 'Daily token quota exceeded' });
  }
  const response = await anthropic.messages.create({
    model: 'claude-sonnet-4-5',
    max_tokens: 1024,
    messages: req.body.messages,
  });
  await recordUsage(sub, response.usage.input_tokens + response.usage.output_tokens);
  res.json(response);
});
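The route above calls checkQuota and recordUsage without defining them. One possible shape for those helpers is sketched below, assuming usage is counted per user per UTC day. The in-memory Map is a stand-in for whatever shared store (Redis, a database) your proxy actually uses, and the key format is hypothetical.

```javascript
// In-memory stand-in for a shared store; a multi-instance proxy would
// need Redis or a database so every instance sees the same counters.
const usageByUserDay = new Map(); // "sub:YYYY-MM-DD" -> tokens used

// One counter per user per UTC calendar day.
function usageKey(sub, now = new Date()) {
  return `${sub}:${now.toISOString().slice(0, 10)}`;
}

// Resolves to true while the user is still under their daily limit.
async function checkQuota(sub, dailyTokenLimit, now = new Date()) {
  const used = usageByUserDay.get(usageKey(sub, now)) ?? 0;
  return used < dailyTokenLimit;
}

// Adds the tokens consumed by one request and resolves to the new total.
async function recordUsage(sub, tokens, now = new Date()) {
  const key = usageKey(sub, now);
  const total = (usageByUserDay.get(key) ?? 0) + tokens;
  usageByUserDay.set(key, total);
  return total;
}
```

Because the check happens before the call and the usage is recorded after it, a user can overshoot the limit by up to one request. If that matters, reserve an estimate up front and reconcile once the real usage is known.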
FastAPI JWT Verification Middleware
# middleware/verify_jwt.py
from fastapi import Depends, HTTPException
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer
from jose import JWTError, jwt

with open("public.pem") as f:
    PUBLIC_KEY = f.read()

bearer_scheme = HTTPBearer()


def verify_jwt(credentials: HTTPAuthorizationCredentials = Depends(bearer_scheme)):
    """Return the verified claims, or raise 401 if the token is invalid or expired."""
    try:
        return jwt.decode(
            credentials.credentials,
            PUBLIC_KEY,
            algorithms=["RS256"],
            issuer="auth.yourapp.com",
            audience="claude-proxy.yourapp.com",
        )
    except JWTError as e:
        # An invalid or expired token is an authentication failure: 401, not 403.
        raise HTTPException(status_code=401, detail=f"Invalid token: {e}") from e
FastAPI proxy route (same quota + usage pattern as Express):
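from anthropic import AsyncAnthropic

client = AsyncAnthropic()  # reads ANTHROPIC_API_KEY from env

# ChatRequest, check_quota, and record_usage are assumed to be defined elsewhere.
@app.post("/api/claude")
async def claude_proxy(req: ChatRequest, user=Depends(verify_jwt)):
    if not await check_quota(user["sub"], user.get("daily_token_limit", 50_000)):
        raise HTTPException(status_code=429, detail="Daily token quota exceeded")
    # Use the async client so the upstream call does not block the event loop.
    response = await client.messages.create(
        model="claude-sonnet-4-5", max_tokens=1024, messages=req.messages
    )
    await record_usage(user["sub"], response.usage.input_tokens + response.usage.output_tokens)
    return response.model_dump()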
Refresh Token Rotation
Short-lived access tokens (15 minutes) limit exposure if stolen. Pair them with rotating refresh tokens (7 days) to maintain seamless UX. Each refresh token can be used exactly once — on use, invalidate it and issue a new pair. If the same refresh token is presented twice, treat it as theft and revoke the entire family for that user.
// POST /auth/refresh
// hash() and issueRefreshToken() are the app's own helpers (not shown here).
app.post('/auth/refresh', async (req, res) => {
  const record = await db.refreshTokens.findByTokenHash(hash(req.body.refreshToken));
  if (!record || record.rotated || record.expiresAt < Date.now()) {
    // Only reuse of an already-rotated token signals theft; an expired token just fails.
    if (record?.rotated) await db.refreshTokens.revokeAllForUser(record.userId);
    return res.status(401).json({ error: 'Invalid or reused refresh token' });
  }
  await db.refreshTokens.markRotated(record.id);
  res.json({
    accessToken: issueToken(record.userId, record.plan),
    refreshToken: issueRefreshToken(record.userId),
  });
});
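The rotation rules can be exercised without a database. The sketch below is an in-memory model of them, assuming SHA-256 for the stored hash and a Map in place of the db.refreshTokens table. The function names and the revoked field are hypothetical, chosen only to make reuse detection easy to follow.

```javascript
const crypto = require('node:crypto');

const REFRESH_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days
const sha256 = (token) => crypto.createHash('sha256').update(token).digest('hex');

// Stand-in for the refresh token table: tokenHash -> record.
const store = new Map();

// Create an opaque token, persist only its hash, return the raw value once.
function issueRefreshToken(userId, now = Date.now()) {
  const token = crypto.randomBytes(40).toString('hex');
  store.set(sha256(token), { userId, expiresAt: now + REFRESH_TTL_MS, rotated: false, revoked: false });
  return token;
}

// Revoke every refresh token belonging to one user (the whole family).
function revokeAllForUser(userId) {
  for (const record of store.values()) {
    if (record.userId === userId) record.revoked = true;
  }
}

// Exchange a refresh token for a new one, detecting reuse along the way.
function rotate(presentedToken, now = Date.now()) {
  const record = store.get(sha256(presentedToken));
  if (!record || record.revoked || record.expiresAt < now) {
    return { ok: false, reason: 'invalid' };
  }
  if (record.rotated) {
    // Second presentation of a single-use token: treat it as theft.
    revokeAllForUser(record.userId);
    return { ok: false, reason: 'reuse' };
  }
  record.rotated = true;
  return { ok: true, refreshToken: issueRefreshToken(record.userId, now) };
}
```

After a reuse is detected, even the newest token in the family stops working, so both the attacker and the legitimate user have to log in again. That is the intended trade-off.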
Scoping Tokens to User Quotas
Encode resource limits in the token payload to avoid a DB round-trip on every request:
// Free tier
{ sub: userId, plan: 'free', daily_token_limit: 50_000, model_access: ['haiku'] }
// Pro tier
{ sub: userId, plan: 'pro', daily_token_limit: 500_000, model_access: ['haiku', 'sonnet'] }
Enforce model access in middleware: reject if req.body.model is not in the token's model_access array. For detailed quota strategies and token counting, see Claude API Rate Limits 2026.
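A sketch of that model check follows. The token carries short family names (haiku, sonnet) while requests carry full model IDs, so some mapping is needed. The FAMILY_MODELS table below is a hypothetical allowlist with placeholder IDs that you would replace with the models you actually offer.

```javascript
// Hypothetical mapping from the family names stored in the token to the
// model IDs the proxy is willing to forward. Adjust to your own catalog.
const FAMILY_MODELS = {
  haiku: ['claude-haiku-4-5'],
  sonnet: ['claude-sonnet-4-5'],
};

// True only if the requested model belongs to a family the token grants.
function isModelAllowed(modelId, modelAccess) {
  if (typeof modelId !== 'string' || !Array.isArray(modelAccess)) return false;
  return modelAccess.some((family) => (FAMILY_MODELS[family] ?? []).includes(modelId));
}

// Express-style middleware; expects verifyJwt to have populated req.user.
function enforceModelAccess(req, res, next) {
  if (!isModelAllowed(req.body?.model, req.user?.model_access)) {
    return res.status(403).json({ error: 'Model not allowed for this plan' });
  }
  next();
}
```

The check fails closed: a missing model field, a missing model_access claim, or an unknown family name all result in a 403. This is the one place where 403 is the right status, since the caller is authenticated but not permitted.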
Rate Limiting Per JWT Subject
Token-level rate limiting prevents a single user from bursting and triggering Anthropic's upstream limits. Use Redis for atomic per-minute counters across proxy instances:
// rateLimit.js: fixed window per JWT sub
// Assumes `redis` is a connected client exposing incr/expire (for example ioredis).
export async function rateLimitBySubject(sub, maxRpm = 20) {
  const key = `rl:${sub}:${Math.floor(Date.now() / 60_000)}`;
  const count = await redis.incr(key);
  if (count === 1) await redis.expire(key, 120); // let stale windows expire on their own
  if (count > maxRpm) {
    throw Object.assign(new Error(`Rate limit: ${maxRpm} req/min`), { status: 429 });
  }
}

// In your proxy route, call before forwarding to Anthropic:
app.post('/api/claude', verifyJwt, async (req, res) => {
  try {
    await rateLimitBySubject(req.user.sub, req.user.plan === 'pro' ? 60 : 20);
  } catch (e) {
    // Fall back to 500 if the error did not come from the limiter (e.g. Redis is down).
    return res.status(e.status ?? 500).json({ error: e.message });
  }
  // ... Anthropic call
});
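The fixed-window logic is easier to reason about with the clock injected. The sketch below mirrors the Redis version with an in-memory Map and also computes how many seconds remain in the current window, which you could return in a Retry-After header. The function name and return shape are assumptions, not part of the code above.

```javascript
const WINDOW_MS = 60_000;
const counters = new Map(); // "sub:windowIndex" -> requests seen in that window

// Count one request for `sub` and report whether it fits in the window.
function checkRateLimit(sub, maxRpm, now = Date.now()) {
  const windowIndex = Math.floor(now / WINDOW_MS);
  const key = `${sub}:${windowIndex}`;
  const count = (counters.get(key) ?? 0) + 1;
  counters.set(key, count);
  // Seconds until the next window opens, rounded up.
  const retryAfterSeconds = Math.ceil(((windowIndex + 1) * WINDOW_MS - now) / 1000);
  return { allowed: count <= maxRpm, retryAfterSeconds };
}
```

A fixed window allows up to twice maxRpm across a window boundary (a burst at the end of one minute plus a burst at the start of the next). If that is too loose, a sliding window or token bucket is the usual next step. The Map here also never evicts old windows, which the Redis version handles with expire.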
Also review Claude Code Security Scanning to catch secrets and auth misconfigurations before they reach production.
Common Attack Patterns
| Attack | Risk | Mitigation |
|---|---|---|
| Token theft via XSS | Attacker exfiltrates localStorage JWT | Store access tokens in memory; refresh tokens in HttpOnly cookies |
| Replay attack | Stolen valid token reused | Short expiry (15 min) + Redis deny-list on logout |
| Missing exp claim | Token valid forever | Pin expiresIn at issuance; reject tokens without exp |
| Algorithm confusion (alg: none) | Signature bypass | Always pin algorithms: ['RS256'] in jwt.verify |
| Weak HS256 secret | Offline brute-force | Use crypto.randomBytes(32).toString('hex') or switch to RS256 |
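The "reject tokens without exp" mitigation needs an explicit check. jsonwebtoken validates exp when the claim is present but does not appear to require it by default, so confirm the behavior against the version you run. A minimal sketch of a post-verification guard follows; the function name and the 24-hour ceiling are assumptions.

```javascript
const MAX_LIFETIME_SECONDS = 24 * 60 * 60; // assumed ceiling on token lifetime

// Run on the payload returned by signature verification, before trusting it.
function requireExpiry(payload, maxLifetimeSeconds = MAX_LIFETIME_SECONDS) {
  if (typeof payload.exp !== 'number') {
    throw new Error('Token has no exp claim');
  }
  // If iat is present, also reject tokens minted with an overly long lifetime.
  if (typeof payload.iat === 'number' && payload.exp - payload.iat > maxLifetimeSeconds) {
    throw new Error('Token lifetime too long');
  }
  return payload;
}
```

In the Express middleware this would wrap the verified payload, for example req.user = requireExpiry(payload), inside the existing try block so a failure takes the same error path as any other invalid token.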
Frequently Asked Questions
Why use RS256 instead of HS256 for Claude proxy auth?
RS256 uses a public/private key pair. The auth server signs with the private key; downstream services verify with the public key without touching the signing secret. HS256 requires sharing the same secret everywhere — expanding your attack surface. For any multi-service architecture, RS256 is the safer default.
How short should my access token expiry be?
15 minutes is a common choice for high-value APIs. Pair it with a 7-day rotating refresh token so the client can renew its session silently instead of sending the user back to a login screen. Avoid issuing access tokens valid for more than 24 hours to a Claude proxy, because a stolen token can spend against your account for its entire lifetime.
How do I revoke a JWT before it expires?
JWTs are stateless — you cannot "un-sign" them. For immediate revocation (logout or compromise), maintain a Redis deny-list keyed by jti (JWT ID). Check it in middleware after signature verification. Short expiry (15 min) limits how long you need to retain revoked token IDs.
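A sketch of such a deny-list is shown below, using a Map in place of Redis so the expiry logic is visible. It assumes your tokens carry a jti claim, which the issueToken example in this article does not set (jsonwebtoken exposes a jwtid sign option for this). The function names are hypothetical.

```javascript
// Stand-in for Redis: jti -> exp (seconds since epoch) of the revoked token.
const denyList = new Map();

// Call on logout or suspected compromise with the token's verified payload.
function revokeToken(payload) {
  denyList.set(payload.jti, payload.exp);
}

// Call in middleware after signature verification.
function isRevoked(payload, nowSeconds = Math.floor(Date.now() / 1000)) {
  // Entries for tokens that have expired anyway can be dropped.
  for (const [jti, exp] of denyList) {
    if (exp <= nowSeconds) denyList.delete(jti);
  }
  return denyList.has(payload.jti);
}
```

With Redis, the purge loop goes away: store the jti with a time-to-live equal to the token's remaining lifetime and let the key expire on its own.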
Should I put the Anthropic API key in the JWT payload?
Never. The JWT payload is only base64url-encoded, not encrypted, so anyone holding the token can read it. Keep ANTHROPIC_API_KEY in server-side environment variables only. The JWT carries identity claims (sub, plan, quota limits); the proxy injects the Anthropic key from process.env.
What happens when my public key rotates?
Keep the old public key available until all tokens signed with it have expired (15 min). A JWKS endpoint (/.well-known/jwks.json) lets services fetch current keys automatically, supporting zero-downtime rotation.
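During a rotation the verifier has to pick the right key for each token. The usual approach is a kid (key ID) value in the JWT header, looked up in a keyring built from the JWKS response. The sketch below assumes the auth server sets kid when signing; the keyring Map and function names are hypothetical.

```javascript
// Decode the (unverified) JWT header to find out which key signed the token.
function parseHeader(token) {
  const [headerPart] = token.split('.');
  return JSON.parse(Buffer.from(headerPart, 'base64url').toString('utf8'));
}

// keyring: Map of kid -> public key (PEM string or KeyObject), holding both
// the current key and the previous one until its tokens have expired.
function selectKey(token, keyring) {
  const { kid, alg } = parseHeader(token);
  // The header is attacker-controlled, so never let it choose the algorithm.
  if (alg !== 'RS256') throw new Error('Unexpected algorithm');
  const key = keyring.get(kid);
  if (!key) throw new Error(`Unknown key id: ${kid}`);
  return key;
}
```

The selected key then goes to the normal verification call. jsonwebtoken also accepts a key-lookup callback in place of a static key, which is a natural place to put this logic.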