Claude API Discord Bot: Complete Guide (2026)

Build a Discord bot powered by Claude API using discord.js v14 and TypeScript. Slash commands, conversation context, rate limiting.


To build a Discord bot powered by Claude API, install discord.js v14 and @anthropic-ai/sdk, register slash commands with the Discord REST API, then handle interactions by forwarding user messages to anthropic.messages.create() and replying with the response. The full setup takes about 30 minutes. This guide covers slash commands, message event handlers, per-user conversation context, rate limiting per server and user, and deployment with PM2 or Docker.


discord.js v14 Setup

Initialize a new project and install dependencies:

mkdir claude-discord-bot && cd claude-discord-bot
npm init -y
npm install discord.js @anthropic-ai/sdk dotenv
npm install -D typescript ts-node @types/node
npx tsc --init

Create a .env file:

DISCORD_TOKEN=your-bot-token
DISCORD_CLIENT_ID=your-application-client-id
ANTHROPIC_API_KEY=sk-ant-...

Create src/index.ts — the bot entry point:

import { Client, GatewayIntentBits, Partials } from "discord.js";
import * as dotenv from "dotenv";
import { registerCommands } from "./commands/register";
import { handleInteraction } from "./handlers/interaction";
import { handleMessage } from "./handlers/message";

dotenv.config();

const client = new Client({
  intents: [
    GatewayIntentBits.Guilds,
    GatewayIntentBits.GuildMessages,
    GatewayIntentBits.MessageContent,
    GatewayIntentBits.DirectMessages,
  ],
  partials: [Partials.Channel],
});

client.once("ready", async () => {
  console.log(`Logged in as ${client.user?.tag}`);
  await registerCommands();
});

client.on("interactionCreate", handleInteraction);
client.on("messageCreate", handleMessage);

client.login(process.env.DISCORD_TOKEN);

Slash Commands with Claude

Register slash commands via the Discord REST API once at startup:

// src/commands/register.ts
import { REST, Routes, SlashCommandBuilder } from "discord.js";

const commands = [
  new SlashCommandBuilder()
    .setName("ask")
    .setDescription("Ask Claude a question")
    .addStringOption((opt) =>
      opt.setName("question").setDescription("Your question").setRequired(true)
    ),
  new SlashCommandBuilder()
    .setName("reset")
    .setDescription("Clear your conversation history with Claude"),
].map((cmd) => cmd.toJSON());

export async function registerCommands() {
  const rest = new REST({ version: "10" }).setToken(process.env.DISCORD_TOKEN!);

  await rest.put(
    Routes.applicationCommands(process.env.DISCORD_CLIENT_ID!),
    { body: commands }
  );

  console.log("Slash commands registered.");
}
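One development tip: globally registered commands can take up to an hour to propagate, while guild-scoped registration is instant. In discord.js this is `Routes.applicationGuildCommands(clientId, guildId)` instead of `Routes.applicationCommands(clientId)`. The helper below just mirrors the route that call builds, shown as plain code so the shape is easy to verify; a `DISCORD_GUILD_ID` env var would be an extra assumption on top of the `.env` file above.

```typescript
// Guild-scoped command route: a PUT here registers commands for one server
// only, and changes show up immediately — handy while iterating in a test guild.
export function guildCommandsRoute(clientId: string, guildId: string): string {
  return `/applications/${clientId}/guilds/${guildId}/commands`;
}

// Equivalent discord.js call inside registerCommands() (assumes DISCORD_GUILD_ID):
// await rest.put(
//   Routes.applicationGuildCommands(
//     process.env.DISCORD_CLIENT_ID!,
//     process.env.DISCORD_GUILD_ID!
//   ),
//   { body: commands }
// );
```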

Handle the /ask interaction and call Claude:

// src/handlers/interaction.ts
import { ChatInputCommandInteraction, Interaction } from "discord.js";
import Anthropic from "@anthropic-ai/sdk";
import { getConversation, addTurn, clearConversation } from "../context/store";
import { checkRateLimit } from "../ratelimit/limiter";

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

export async function handleInteraction(interaction: Interaction) {
  if (!interaction.isChatInputCommand()) return;

  if (interaction.commandName === "reset") {
    clearConversation(interaction.user.id);
    await interaction.reply({ content: "Conversation cleared.", ephemeral: true });
    return;
  }

  if (interaction.commandName === "ask") {
    await handleAsk(interaction);
  }
}

async function handleAsk(interaction: ChatInputCommandInteraction) {
  const question = interaction.options.getString("question", true);
  const userId = interaction.user.id;
  const guildId = interaction.guildId ?? "dm";

  // Check rate limit before calling Claude
  const allowed = checkRateLimit(userId, guildId);
  if (!allowed) {
    await interaction.reply({
      content: "You have hit the rate limit. Please wait a moment.",
      ephemeral: true,
    });
    return;
  }

  // Defer reply — Claude may take >3 seconds
  await interaction.deferReply();

  const history = getConversation(userId);
  history.push({ role: "user", content: question });

  const response = await anthropic.messages.create({
    model: "claude-haiku-4-5",
    max_tokens: 1024,
    system: "You are a helpful assistant in a Discord server. Keep responses concise and formatted for Discord (use markdown where helpful). Never exceed 1900 characters.",
    messages: history,
  });

  const reply = response.content[0].type === "text" ? response.content[0].text : "";
  history.push({ role: "assistant", content: reply });
  addTurn(userId, history);

  // Discord messages have a 2000-character limit
  const truncated = reply.length > 1900 ? reply.slice(0, 1897) + "..." : reply;
  await interaction.editReply(truncated);
}
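One gap worth closing before production: if the Claude call throws after `deferReply()`, the interaction stays stuck on "thinking..." forever. A minimal sketch of a wrapper that resolves the deferred reply on failure — the helper name `withFallback` and the error message are illustrative assumptions, not part of the handler above:

```typescript
// Wrap an API call so a failure still resolves the deferred interaction
// instead of leaving the bot stuck on "thinking...".
export async function withFallback<T>(
  call: () => Promise<T>,
  onError: (notice: string) => Promise<unknown>
): Promise<T | null> {
  try {
    return await call();
  } catch (err) {
    console.error("Claude API call failed:", err);
    await onError("Sorry, I couldn't reach Claude. Please try again in a moment.");
    return null;
  }
}
```

In handleAsk this would look like `const response = await withFallback(() => anthropic.messages.create({ ... }), (msg) => interaction.editReply(msg)); if (!response) return;`.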

Message Event Handlers

For bots that respond when mentioned (not slash commands only):

// src/handlers/message.ts
import { Message } from "discord.js";
import Anthropic from "@anthropic-ai/sdk";
import { getConversation, addTurn } from "../context/store";
import { checkRateLimit } from "../ratelimit/limiter";

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY });

export async function handleMessage(message: Message) {
  // Ignore bots, ignore messages that don't mention this bot
  if (message.author.bot) return;
  if (!message.mentions.users.has(message.client.user!.id)) return;

  const userId = message.author.id;
  const guildId = message.guildId ?? "dm";

  if (!checkRateLimit(userId, guildId)) {
    await message.reply("Rate limit reached. Please wait before sending more messages.");
    return;
  }

  // Strip the bot mention from the message content
  const userText = message.content
    .replace(/<@!?\d+>/g, "")
    .trim();

  if (!userText) {
    await message.reply("What would you like to ask?");
    return;
  }

  const history = getConversation(userId);
  history.push({ role: "user", content: userText });

  const typing = message.channel.sendTyping();

  const response = await anthropic.messages.create({
    model: "claude-haiku-4-5",
    max_tokens: 1024,
    system: "You are a helpful Discord assistant. Be concise. Use Discord markdown.",
    messages: history,
  });

  await typing;

  const reply = response.content[0].type === "text" ? response.content[0].text : "";
  history.push({ role: "assistant", content: reply });
  addTurn(userId, history);

  const truncated = reply.length > 1900 ? reply.slice(0, 1897) + "..." : reply;
  await message.reply(truncated);
}

Conversation Context Management

Store per-user conversation history in memory with a rolling window to control token usage:

// src/context/store.ts
import type { MessageParam } from "@anthropic-ai/sdk/resources/messages";

const MAX_TURNS = 10; // keep last 10 exchanges
const TTL_MS = 30 * 60 * 1000; // 30-minute idle expiry

interface Session {
  history: MessageParam[];
  lastActive: number;
}

const sessions = new Map<string, Session>();

export function getConversation(userId: string): MessageParam[] {
  const session = sessions.get(userId);
  if (!session) return [];

  // Expire stale sessions
  if (Date.now() - session.lastActive > TTL_MS) {
    sessions.delete(userId);
    return [];
  }

  return [...session.history];
}

export function addTurn(userId: string, history: MessageParam[]) {
  // Trim to rolling window: keep last MAX_TURNS pairs (user + assistant = 2 entries each)
  const trimmed = history.slice(-MAX_TURNS * 2);
  sessions.set(userId, { history: trimmed, lastActive: Date.now() });
}

export function clearConversation(userId: string) {
  sessions.delete(userId);
}

Token cost data point: With MAX_TURNS = 10, average conversation context is ~1,200 input tokens per request. Without windowing, a 50-message thread would send ~6,000 tokens of context on every call — a 5x cost increase. See Claude API Cost and Prompt Caching Break-Even for how to use prompt caching to reduce repeat context costs by up to 90%.
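A rough way to sanity-check the window's token budget is the common ~4-characters-per-token heuristic (an approximation, not an exact tokenizer):

```typescript
// Rough token estimate for a history window (~4 chars per token heuristic).
type Turn = { role: "user" | "assistant"; content: string };

export function estimateTokens(history: Turn[]): number {
  const chars = history.reduce((sum, turn) => sum + turn.content.length, 0);
  return Math.ceil(chars / 4);
}
```

Twenty entries (10 exchanges) averaging ~240 characters each lands near the ~1,200-token figure above.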


Ship production Discord bots faster

Agent SDK Cookbook ($49) includes 30+ production recipes for Claude integrations: conversation context patterns, tool use chains, streaming handlers, multi-agent coordination, and deployment templates — including a full Discord bot starter project.

Get Agent SDK Cookbook — $49


Rate Limiting Per Server and User

Protect your API budget with a simple token-bucket rate limiter. The variant below refills the full bucket once per minute, which behaves like a fixed one-minute window:

// src/ratelimit/limiter.ts

interface Bucket {
  tokens: number;
  lastRefill: number;
}

// Per-user: 5 requests per minute
// Per-guild: 30 requests per minute
const USER_LIMIT = 5;
const GUILD_LIMIT = 30;
const REFILL_INTERVAL_MS = 60_000;

const userBuckets = new Map<string, Bucket>();
const guildBuckets = new Map<string, Bucket>();

function consume(map: Map<string, Bucket>, key: string, maxTokens: number): boolean {
  const now = Date.now();
  let bucket = map.get(key);

  if (!bucket) {
    bucket = { tokens: maxTokens, lastRefill: now };
    map.set(key, bucket);
  }

  // Refill tokens if interval has passed
  const elapsed = now - bucket.lastRefill;
  if (elapsed >= REFILL_INTERVAL_MS) {
    bucket.tokens = maxTokens;
    bucket.lastRefill = now;
  }

  if (bucket.tokens <= 0) return false;

  bucket.tokens -= 1;
  return true;
}

export function checkRateLimit(userId: string, guildId: string): boolean {
  const userAllowed = consume(userBuckets, userId, USER_LIMIT);
  const guildAllowed = consume(guildBuckets, guildId, GUILD_LIMIT);
  return userAllowed && guildAllowed;
}

For multi-process deployments, replace the in-memory map with Redis (ioredis) so limits are shared across instances.
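A minimal sketch of that swap, assuming a shared Redis and a fixed-window counter. The `RedisLike` interface and key names are illustrative; an actual ioredis client provides `incr` and `expire` with these signatures, so it satisfies the interface directly.

```typescript
// Minimal Redis surface this limiter needs (mirrors ioredis's incr/expire).
interface RedisLike {
  incr(key: string): Promise<number>;
  expire(key: string, seconds: number): Promise<number>;
}

const WINDOW_SECONDS = 60;

export function rateKey(scope: "user" | "guild", id: string): string {
  return `ratelimit:${scope}:${id}`;
}

export async function consumeShared(
  redis: RedisLike,
  key: string,
  maxRequests: number
): Promise<boolean> {
  // INCR is atomic, so every bot process shares one counter per key.
  const count = await redis.incr(key);
  if (count === 1) {
    // First request in this window: start the 60-second expiry clock.
    await redis.expire(key, WINDOW_SECONDS);
  }
  return count <= maxRequests;
}
```

In production you would construct the client with `new Redis(process.env.REDIS_URL)` and call `consumeShared` once for the user key and once for the guild key, exactly like `checkRateLimit` above.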


Cost Analysis by Server Size

The following estimates use Claude Haiku (input: $0.80/MTok, output: $4.00/MTok) with an average of 200 input tokens and 300 output tokens per request:

Server Size             | Daily Active Users | Requests/Day | Input Cost/Month | Output Cost/Month | Total/Month
Small (< 100 members)   | 10                 | 50           | $0.24            | $1.80             | ~$2.04
Medium (100-1,000)      | 50                 | 300          | $1.44            | $10.80            | ~$12.24
Large (1,000-10,000)    | 200                | 1,500        | $7.20            | $54.00            | ~$61.20
Community (10,000+)     | 1,000              | 8,000        | $38.40           | $288.00           | ~$326.40
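The monthly figures follow directly from per-request arithmetic over a 30-day month. A sketch of that calculation (`monthlyCostUSD` is an illustrative helper, not part of the bot code):

```typescript
// Per-request cost at the stated assumptions: 200 input + 300 output tokens,
// Haiku at $0.80 / $4.00 per million tokens, 30-day month.
const INPUT_PER_TOKEN = 0.80 / 1_000_000;
const OUTPUT_PER_TOKEN = 4.00 / 1_000_000;

export function monthlyCostUSD(requestsPerDay: number): number {
  const perRequest = 200 * INPUT_PER_TOKEN + 300 * OUTPUT_PER_TOKEN; // ≈ $0.00136
  return perRequest * requestsPerDay * 30;
}
```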

Model routing tip: Route simple factual queries to Haiku and escalate complex reasoning to Sonnet only when needed. A 70/30 Haiku/Sonnet split costs roughly half as much as running everything on Sonnet while maintaining quality. See Claude Haiku vs Sonnet vs Opus: Which Model for benchmark comparisons.

Prompt caching is especially valuable in Discord bots. Cache your system prompt and any static knowledge base content — on a server with 1,000 daily requests, caching cuts input token costs by up to 90%.
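In the Messages API, `system` also accepts an array of content blocks, and a `cache_control: { type: "ephemeral" }` marker on a block asks the API to cache the prompt prefix up to that point. A minimal sketch of building that shape (the `cachedSystem` helper is illustrative):

```typescript
// Shape of a cacheable system block for the Anthropic Messages API.
type SystemBlock = {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
};

export function cachedSystem(prompt: string): SystemBlock[] {
  // Everything up to and including this block is eligible for caching.
  return [{ type: "text", text: prompt, cache_control: { type: "ephemeral" } }];
}

// Usage: anthropic.messages.create({ model, max_tokens,
//   system: cachedSystem(SYSTEM_PROMPT), messages: history })
```

Note that there are minimum cacheable prefix lengths (around a thousand tokens depending on model), so caching pays off mainly when the system prompt includes static knowledge-base content.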


Deployment with PM2 or Docker

PM2 Deployment

PM2 keeps your bot alive, restarts on crash, and manages logs:

npm install -g pm2
npx tsc
pm2 start dist/index.js --name claude-bot
pm2 save
pm2 startup  # generate systemd/launchd startup script

Monitor with pm2 logs claude-bot and pm2 monit.

Docker Deployment

# Dockerfile
FROM node:20-alpine
WORKDIR /app

COPY package*.json ./
RUN npm ci --omit=dev

COPY tsconfig.json ./
COPY src ./src
RUN npx tsc

ENV NODE_ENV=production
CMD ["node", "dist/index.js"]

# docker-compose.yml
version: "3.9"
services:
  claude-bot:
    build: .
    restart: unless-stopped
    env_file: .env
    environment:
      - NODE_ENV=production

Deploy with docker compose up -d. For Redis-backed rate limiting, add a Redis service to the Compose file and update the limiter to use ioredis.
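As a sketch, the Compose file with a Redis service might look like this (the `redis` service name and `REDIS_URL` variable are assumptions the limiter code would read):

```yaml
version: "3.9"
services:
  claude-bot:
    build: .
    restart: unless-stopped
    env_file: .env
    environment:
      - NODE_ENV=production
      - REDIS_URL=redis://redis:6379
    depends_on:
      - redis
  redis:
    image: redis:7-alpine
    restart: unless-stopped
```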

For Slack bot patterns using the same Claude API integration approach, see Claude API Slack Bot Guide.


Want the complete production starter kit?

Agent SDK Cookbook ($49) includes the full Claude-powered Discord bot template: TypeScript, slash commands, context management, rate limiting, Redis support, and Docker config — plus 29 more production recipes for building with the Anthropic SDK.

Get Agent SDK Cookbook — $49


Frequently Asked Questions

What Discord bot permissions do I need to enable Claude API integration?

In the Discord Developer Portal, enable the Message Content Intent under Bot settings — this is required to read message content when the bot is mentioned. For slash commands only, you do not need message content intent. Grant the bot Send Messages, Read Message History, and Use Application Commands permissions when inviting to a server.

How do I prevent my bot from running up a large API bill?

Implement rate limiting per user and per guild before calling the Claude API (see the limiter example above). Set a max_tokens cap (1024 is usually enough for Discord replies). Use Claude Haiku for most requests: at the pricing quoted in this guide it costs roughly a quarter of Sonnet's per-token price while handling conversational tasks well. Set a monthly budget alert in the Anthropic Console.

Can I store conversation history in a database instead of memory?

Yes. Replace the in-memory Map in src/context/store.ts with database calls. Use a conversations table keyed by userId, storing the history as a JSON column. PostgreSQL, SQLite (via better-sqlite3), or Redis all work well. Database-backed context survives bot restarts and works across multiple bot instances.
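One way to make that swap painless is to define the store as an interface first, implement it in memory today, and back it with a database later. The interface below is a suggested shape, not part of the code above; note the methods become async because database calls are:

```typescript
type Turn = { role: "user" | "assistant"; content: string };

// Store contract: in-memory today, Postgres/SQLite/Redis later.
export interface ConversationStore {
  get(userId: string): Promise<Turn[]>;
  set(userId: string, history: Turn[]): Promise<void>;
  clear(userId: string): Promise<void>;
}

// Reference in-memory implementation mirroring src/context/store.ts semantics.
export class MemoryStore implements ConversationStore {
  private sessions = new Map<string, Turn[]>();
  async get(userId: string) { return [...(this.sessions.get(userId) ?? [])]; }
  async set(userId: string, history: Turn[]) { this.sessions.set(userId, history); }
  async clear(userId: string) { this.sessions.delete(userId); }
}
```

A database-backed class then only has to implement the same three methods, and the handlers never change.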

How do I handle Discord's 2000-character message limit with Claude responses?

Set a hard limit in your system prompt ("Never exceed 1900 characters") and truncate defensively in code before sending. For longer responses, split the text at sentence boundaries and send as multiple messages using message.reply() followed by message.channel.send(). Alternatively, send a follow-up embed or file attachment for very long output.
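A sketch of splitting at sentence boundaries (the `splitForDiscord` helper is illustrative; it prefers a newline, then a sentence end, before falling back to a hard cut):

```typescript
// Split a long reply into chunks under Discord's limit, preferring a
// newline, then a sentence boundary, before a hard character cut.
export function splitForDiscord(text: string, limit = 1900): string[] {
  const chunks: string[] = [];
  let rest = text;
  while (rest.length > limit) {
    const slice = rest.slice(0, limit);
    let cut = slice.lastIndexOf("\n");
    if (cut <= 0) {
      const sentence = slice.lastIndexOf(". ");
      cut = sentence > 0 ? sentence + 1 : limit; // keep the period in the chunk
    }
    chunks.push(rest.slice(0, cut).trimEnd());
    rest = rest.slice(cut).trimStart();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}
```

Send the first chunk with message.reply(chunks[0]) and the rest with message.channel.send(...).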

Should I use claude-haiku or claude-sonnet for a Discord bot?

Start with claude-haiku-4-5 for all requests. Haiku handles conversational Q&A, code snippets, and explanations well at roughly a quarter of Sonnet's per-token cost. Upgrade to claude-sonnet-4-5 only if your use case requires complex reasoning, long document analysis, or nuanced writing. For model selection guidance, see Claude Haiku vs Sonnet vs Opus: Which Model.
