Claude API + Hono Framework: Edge-First Setup (Bun, Cloudflare, 2026)
Hono is the fastest growing JS/TS framework in 2026 β built for edge runtime (Cloudflare Workers, Bun, Deno, Node), with 60ms cold starts and TypeScript-first design. Combined with Claude API, it gives the lowest-latency Claude backend you can deploy: ~80ms from user to first Claude token on edge regions, vs ~250ms for typical Node/Express setups. Free tier on Cloudflare Workers covers 100K requests/day. This guide covers Hono + Claude end-to-end: setup, streaming SSE, tool use, prompt caching, error handling, and deployment to Cloudflare Workers or Bun.
For Claude API basics see Python tutorial. For comparable framework setups see Vercel AI SDK + Claude and FastAPI MCP Server.
Why Hono for Claude API
| Framework | Cold start | Bundle size | Edge support |
|---|---|---|---|
| Hono | ~60ms | 12KB | Cloudflare, Bun, Deno, Node |
| Express | ~250ms | 60KB+ | Node only |
| Fastify | ~150ms | 35KB | Node only |
| Next.js API | ~400ms | varies | Vercel Edge OK |
| Vercel AI SDK | ~150ms | 40KB | Edge yes |
For Claude (latency-sensitive LLM streaming), Hono's edge-first design wins.
Setup (Bun)
mkdir claude-hono && cd claude-hono
bun init -y
bun add hono @anthropic-ai/sdk
src/index.ts:
import { Hono } from "hono";
import { cors } from "hono/cors";
import Anthropic from "@anthropic-ai/sdk";
const app = new Hono<{ Bindings: { ANTHROPIC_API_KEY: string } }>();
app.use("/*", cors());
app.post("/chat", async (c) => {
const { message } = await c.req.json<{ message: string }>();
const client = new Anthropic({ apiKey: c.env.ANTHROPIC_API_KEY });
const response = await client.messages.create({
model: "claude-sonnet-4-5",
max_tokens: 1024,
messages: [{ role: "user", content: message }]
});
return c.json({ reply: response.content[0].text });
});
export default app;
bun run --hot src/index.ts
# Server on http://localhost:3000
curl -X POST http://localhost:3000/chat \
-H "Content-Type: application/json" \
-d '{"message":"Hello Claude"}'
Streaming SSE (Server-Sent Events)
import { stream } from "hono/streaming";
app.post("/chat/stream", async (c) => {
const { message } = await c.req.json<{ message: string }>();
const client = new Anthropic({ apiKey: c.env.ANTHROPIC_API_KEY });
return stream(c, async (s) => {
s.writeln("data: " + JSON.stringify({ type: "start" }));
const response = await client.messages.create({
model: "claude-sonnet-4-5",
max_tokens: 1024,
messages: [{ role: "user", content: message }],
stream: true
});
for await (const event of response) {
if (event.type === "content_block_delta" &&
event.delta.type === "text_delta") {
s.writeln("data: " + JSON.stringify({
type: "delta",
text: event.delta.text
}));
}
}
s.writeln("data: " + JSON.stringify({ type: "done" }));
});
});
Client side:
const es = new EventSource("/chat/stream", { /* POST hack via lib */ });
es.onmessage = (e) => {
const data = JSON.parse(e.data);
if (data.type === "delta") {
outputElement.textContent += data.text;
}
};
For more streaming patterns see Claude API Streaming Guide.
Tool Use Integration
const tools = [
{
name: "get_weather",
description: "Get current weather for a city",
input_schema: {
type: "object",
properties: {
city: { type: "string", description: "City name" }
},
required: ["city"]
}
}
];
app.post("/agent", async (c) => {
const { message } = await c.req.json<{ message: string }>();
const client = new Anthropic({ apiKey: c.env.ANTHROPIC_API_KEY });
let messages = [{ role: "user" as const, content: message }];
while (true) {
const response = await client.messages.create({
model: "claude-sonnet-4-5",
max_tokens: 1024,
tools,
messages
});
if (response.stop_reason === "end_turn") {
return c.json({ reply: response.content[0].text });
}
if (response.stop_reason === "tool_use") {
messages.push({ role: "assistant", content: response.content });
const toolUse = response.content.find(b => b.type === "tool_use");
const result = await executeTool(toolUse!.name, toolUse!.input);
messages.push({
role: "user",
content: [{
type: "tool_result",
tool_use_id: toolUse!.id,
content: JSON.stringify(result)
}]
});
}
}
});
async function executeTool(name: string, args: any) {
if (name === "get_weather") {
// Call your weather API
return { temp: 22, condition: "sunny" };
}
throw new Error(`Unknown tool: ${name}`);
}
For deeper tool use patterns see Claude Tool Use Guide.
Prompt Caching for 90% Savings
System prompts cached at the edge:
app.post("/chat-cached", async (c) => {
const { message } = await c.req.json();
const client = new Anthropic({ apiKey: c.env.ANTHROPIC_API_KEY });
const response = await client.messages.create({
model: "claude-sonnet-4-5",
max_tokens: 1024,
system: [{
type: "text",
text: LONG_SYSTEM_PROMPT, // 5000+ token persona/instructions
cache_control: { type: "ephemeral" }
}],
messages: [{ role: "user", content: message }]
});
return c.json({
reply: response.content[0].text,
cache_read_tokens: response.usage.cache_read_input_tokens || 0
});
});
After first request, cached system prompt cost drops from $0.015 to $0.0015 per call (90% off). See Claude Prompt Caching Guide.
Error Handling
import { HTTPException } from "hono/http-exception";
app.onError((err, c) => {
if (err instanceof Anthropic.APIError) {
if (err.status === 429) {
return c.json({ error: "rate_limited", retry_after: 60 }, 429);
}
if (err.status === 529) {
return c.json({ error: "anthropic_overloaded" }, 503);
}
return c.json({ error: err.message }, err.status);
}
return c.json({ error: "internal_error" }, 500);
});
// Exponential backoff for transient errors
async function callClaudeWithRetry(client: Anthropic, params: any, maxRetries=3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await client.messages.create(params);
} catch (e: any) {
if (e.status === 429 || e.status === 529) {
await new Promise(r => setTimeout(r, Math.pow(2, i) * 1000));
continue;
}
throw e;
}
}
throw new Error("Max retries exceeded");
}
Full production patterns in Claude API Error Handling.
Deploy to Cloudflare Workers
bun add -d wrangler
npx wrangler login
wrangler.toml:
name = "claude-hono"
main = "src/index.ts"
compatibility_date = "2026-05-22"
compatibility_flags = ["nodejs_compat"]
[vars]
# Set ANTHROPIC_API_KEY via:
# npx wrangler secret put ANTHROPIC_API_KEY
npx wrangler secret put ANTHROPIC_API_KEY # paste key when prompted
npx wrangler deploy
# Live at https://claude-hono.<your-subdomain>.workers.dev
Cost on Cloudflare: 100K requests/day free, then $0.30/M requests. For 1M requests/day = ~$9/month for the runtime (Claude API cost is separate).
Frequently Asked Questions
Why Hono over Express for Claude?
Cold start: Hono 60ms vs Express 250ms β matters for serverless/edge. Bundle: 12KB vs 60KB+ β fits in Workers' 1MB script limit easily. Edge support: Hono runs on Cloudflare, Bun, Deno, Vercel Edge. Express is Node-only.
Can I run Hono on Vercel?
Yes. Hono adapter works with Vercel Edge runtime. Use @hono/vercel or directly export app.fetch. Cost: Vercel Hobby is free up to 100K function invocations.
Does streaming work on Cloudflare Workers?
Yes β Workers natively support Web Streams API which Hono uses. SSE works out of the box with stream() helper. Note: Workers has a 30s execution limit by default (free tier) or 5min (paid).
How do I do server-side rendering with Hono?
Hono has @hono/jsx for JSX rendering or use hono/html for template strings. For full SSR see Hono's docs. For Claude-driven content, render Claude's output server-side using the streaming primitive.
Does Hono integrate with Voyage embeddings?
Yes β voyageai SDK works in Hono unchanged. Use Hono for the orchestration layer (RAG retrieval, Claude call), with Voyage for embeddings. Full setup in Claude + Voyage Embeddings Guide.
Master Edge-First Claude API
Claude Agent SDK Cookbook ($79) β 40 production recipes including Hono + Cloudflare Workers, Bun, Deno deployment patterns. Streaming, tool use, caching, error handling, eval.