Claude + Voyage AI Embeddings: The Anthropic-Recommended Stack (2026)

Voyage AI is Anthropic's recommended embeddings provider for Claude RAG. voyage-3 outperforms OpenAI at 50% cost. Setup, models, multilingual, reranker.

Voyage AI is the embedding provider Anthropic officially recommends for Claude RAG pipelines. voyage-3 scores 65.4 on MTEB (vs 64.6 for OpenAI text-embedding-3-large) and costs $0.06/M tokens (vs $0.13/M), making it roughly 50% cheaper at higher quality. Voyage also ships rerank-2 ($0.05/M query+doc tokens) and voyage-code-2 (specialized for code search). This guide covers when to choose Voyage over alternatives, the four model variants, multilingual handling, the reranking workflow, and migration from OpenAI embeddings.

For Claude + Pinecone RAG end-to-end see Claude + Pinecone Vector DB. For embeddings-vs-search alternatives see Claude API Semantic Search.


Voyage AI vs Alternatives (Benchmark)

| Provider | Model | Cost per 1M tokens | MTEB | Dimensions |
|---|---|---|---|---|
| Voyage AI | voyage-3 | $0.06 | 65.4 | 1024 |
| Voyage AI | voyage-3-large | $0.18 | 67.2 | 2048 |
| OpenAI | text-embedding-3-large | $0.13 | 64.6 | 3072 |
| OpenAI | text-embedding-3-small | $0.02 | 62.3 | 1536 |
| Cohere | embed-v3 | $0.10 | 64.5 | 1024 |
| Open source | BGE-large-en-v1.5 | $0 (self-host) | 64.2 | 1024 |

voyage-3 is the new default for Claude RAG: best quality-per-dollar, Anthropic-blessed.
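The "roughly 50% cheaper" figure follows from the prices in the table. A quick sketch of that arithmetic, using only the numbers listed above (the `MODELS` dictionary and both helper names are illustrative, not part of any SDK):

```python
# Prices (USD per 1M tokens) and MTEB scores copied from the comparison table.
MODELS = {
    "voyage-3": {"price": 0.06, "mteb": 65.4},
    "voyage-3-large": {"price": 0.18, "mteb": 67.2},
    "text-embedding-3-large": {"price": 0.13, "mteb": 64.6},
    "text-embedding-3-small": {"price": 0.02, "mteb": 62.3},
    "embed-v3": {"price": 0.10, "mteb": 64.5},
}


def savings_vs(baseline: str, candidate: str) -> float:
    """Fraction of per-token cost saved by switching from baseline to candidate."""
    base_price = MODELS[baseline]["price"]
    return (base_price - MODELS[candidate]["price"]) / base_price


def mteb_delta(baseline: str, candidate: str) -> float:
    """MTEB points gained (positive) or lost (negative) by switching."""
    return MODELS[candidate]["mteb"] - MODELS[baseline]["mteb"]


saving = savings_vs("text-embedding-3-large", "voyage-3")
delta = mteb_delta("text-embedding-3-large", "voyage-3")
print(f"voyage-3 vs text-embedding-3-large: {saving:.0%} cheaper, {delta:+.1f} MTEB")
```

The same two helpers can compare any pair of rows, which is useful when the cheapest option (text-embedding-3-small) or the highest-scoring one (voyage-3-large) is also on the table for your workload.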


Setup (60 seconds)

pip install voyageai
# or
bun add voyageai

Get an API key at https://voyageai.com (Anthropic Console users get a Voyage credit).

import voyageai
vo = voyageai.Client()  # uses VOYAGE_API_KEY env var

# Embed documents
docs = ["Claude is a powerful LLM", "Voyage makes embeddings"]
result = vo.embed(docs, model="voyage-3", input_type="document")
embeddings = result.embeddings  # list of 1024-dim vectors

That's it. Now feed into Pinecone, pgvector, Qdrant, or any vector DB.
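To make that hand-off concrete, here is a minimal sketch of what a vector DB does with the vectors: store them alongside the text and rank by cosine similarity at query time. `InMemoryIndex` and its `upsert`/`query` methods are hypothetical stand-ins, not the client of any real database, and the 3-dim vectors are toy values used in place of real 1024-dim embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryIndex:
    """Toy stand-in for a vector DB (hypothetical, for illustration only)."""

    def __init__(self):
        self._rows = []  # list of (embedding, text) pairs

    def upsert(self, embeddings, texts):
        self._rows.extend(zip(embeddings, texts))

    def query(self, query_emb, top_k=5):
        # Brute-force scan: score every stored vector, keep the best top_k.
        scored = [
            {"text": text, "score": cosine(query_emb, emb)}
            for emb, text in self._rows
        ]
        scored.sort(key=lambda row: row["score"], reverse=True)
        return scored[:top_k]


# Toy vectors stand in for `result.embeddings` from the setup snippet.
index = InMemoryIndex()
index.upsert(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    ["Claude is a powerful LLM", "Voyage makes embeddings"],
)
hits = index.query([0.9, 0.1, 0.0], top_k=1)
print(hits[0]["text"])
```

A production store replaces the brute-force scan with an approximate nearest-neighbor index, but the contract (vectors in, ranked texts out) stays the same.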


Choose the Right Model

voyage-3 (recommended default)

voyage-3-large (when quality > cost)

voyage-code-2 (code search)

rerank-2 (the reranking complement)


The Two-Stage Pattern (Industry Standard)

Vector search alone has ~70% top-5 accuracy. Adding rerank pushes it to ~90%+.

import voyageai

vo = voyageai.Client()

def search_with_rerank(query: str, vector_db, top_k=5):
    # Stage 1: cheap vector search retrieves 20 candidates
    query_emb = vo.embed([query], model="voyage-3",
                          input_type="query").embeddings[0]
    candidates = vector_db.query(query_emb, top_k=20)

    # Stage 2: expensive rerank picks the best 5
    docs = [c["text"] for c in candidates]
    reranked = vo.rerank(query=query, documents=docs,
                         model="rerank-2", top_k=top_k)

    return [
        {**candidates[r.index], "rerank_score": r.relevance_score}
        for r in reranked.results
    ]

Cost per query (50K-doc index):


Multilingual: Korean + English Together

voyage-3 handles mixed-language corpora natively:

docs = [
    "Claude is Anthropic's AI assistant",
    "Claude๋Š” Anthropic์˜ AI ์–ด์‹œ์Šคํ„ดํŠธ์ž…๋‹ˆ๋‹ค",
    "Cloudeใฏ Anthropicใฎ AIใ‚ขใ‚ทใ‚นใ‚ฟใƒณใƒˆใงใ™"
]
embs = vo.embed(docs, model="voyage-3", input_type="document").embeddings
# Query in any language retrieves all three

No language detection or routing needed: embed once, search across languages. See Korean Prompt Engineering for Korean-specific Claude patterns.


Input Type: document vs query

Voyage lets you mark each text as a document (stored) or a query (search) through the input_type parameter. The parameter is optional, but you should set it for retrieval workloads:

# When embedding for storage
docs_emb = vo.embed(documents, model="voyage-3", input_type="document").embeddings

# When embedding a search query
query_emb = vo.embed([user_query], model="voyage-3", input_type="query").embeddings[0]

The two input types are optimized differently. Skipping the distinction costs roughly 5-10% in retrieval accuracy.
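One way to avoid forgetting the distinction is to hide it behind two thin wrappers, so call sites never pass `input_type` by hand. This is a sketch, not part of the Voyage SDK: `embed_documents` and `embed_query` are hypothetical helper names, and `client` is assumed to be a `voyageai.Client()` (or anything with a compatible `embed` method).

```python
def embed_documents(client, texts, model="voyage-3"):
    """Embed texts for storage; always tagged as documents."""
    response = client.embed(list(texts), model=model, input_type="document")
    return response.embeddings


def embed_query(client, text, model="voyage-3"):
    """Embed one search query; always tagged as a query."""
    response = client.embed([text], model=model, input_type="query")
    return response.embeddings[0]


# Usage with the real client (requires VOYAGE_API_KEY):
#   import voyageai
#   vo = voyageai.Client()
#   doc_vectors = embed_documents(vo, ["Claude is a powerful LLM"])
#   query_vector = embed_query(vo, "which LLM is powerful?")
```

Passing the client in as an argument also makes the helpers easy to unit-test with a fake client that records the `input_type` it receives.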


Migration from OpenAI Embeddings

If you're moving from OpenAI text-embedding-3-large:

# OLD
import openai
client = openai.OpenAI()
emb = client.embeddings.create(
    input=text,
    model="text-embedding-3-large"
).data[0].embedding  # 3072-dim

# NEW (Voyage)
import voyageai
vo = voyageai.Client()
emb = vo.embed([text], model="voyage-3",
               input_type="document").embeddings[0]  # 1024-dim

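The dimension difference matters: 3072-dim and 1024-dim vectors cannot share an index, so you must re-embed every document in your vector DB. The migration:

  1. Create a new index with dimension=1024
  2. Re-embed all docs with voyage-3 (one-time cost: ~$30 per 1M docs at ~500 tokens each, consistent with the Cost at Scale table)
  3. Cut over queries to the new index
  4. Delete the old index

For a 1M-doc dataset: ~$30 + ~1 hour. Quality improves, and per-token embedding cost drops by about 50%.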
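The re-embedding step is usually run in batches rather than as one giant request. A minimal sketch, under stated assumptions: `batched` and `reembed_all` are hypothetical helpers, the embedding call is injected as `embed_batch` so the loop can be exercised offline, and the default `batch_size=128` is an assumption to check against Voyage's current per-request limits.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def reembed_all(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[List[float]]],
    batch_size: int = 128,  # assumed limit; verify against the API docs
) -> List[List[float]]:
    """Re-embed every text, one batch per call, preserving input order."""
    vectors: List[List[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


# With the real client, embed_batch could be:
#   lambda batch: vo.embed(batch, model="voyage-3",
#                          input_type="document").embeddings

# Offline demo with a fake embedder (one number per text: its length).
fake_embed = lambda batch: [[float(len(text))] for text in batch]
vectors = reembed_all(["a", "bb", "ccc"], fake_embed, batch_size=2)
print(vectors)
```

Because order is preserved, `vectors[i]` always corresponds to `texts[i]`, so the results can be written to the new index with the existing document IDs.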


Reranking Without Voyage Vectors

You can use voyage rerank-2 on TOP of OpenAI/Cohere vector search:

# Use OpenAI for vector search (existing infrastructure)
candidates = openai_vector_db.query(query_emb, top_k=20)

# Use Voyage for reranking (additive, no migration)
reranked = vo.rerank(query=query, documents=[c["text"] for c in candidates],
                     model="rerank-2", top_k=5)

Cheapest way to get a 20% accuracy boost on existing RAG: add voyage rerank-2 on top.


Cost at Scale

| Scale | One-time embed | Monthly query (50K queries) |
|---|---|---|
| 10K docs | $0.30 | $0.50 |
| 100K docs | $3 | $0.50 |
| 1M docs | $30 | $2 |
| 10M docs | $300 | $20 |

Embedding cost is a one-time write expense. Query cost scales with traffic, not corpus size.
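The one-time column can be reproduced with a short estimator. The figures are consistent with voyage-3's $0.06/M price at about 500 tokens per document. That average is inferred from the table rather than stated in the source, so treat it as an assumption and substitute your own measured value. The function name is illustrative.

```python
VOYAGE_3_PRICE_PER_M = 0.06  # USD per 1M tokens, as quoted in this guide


def one_time_embed_cost(
    n_docs: int,
    avg_tokens_per_doc: int = 500,  # assumption inferred from the table
    price_per_m: float = VOYAGE_3_PRICE_PER_M,
) -> float:
    """Estimated USD cost to embed a corpus once."""
    total_tokens = n_docs * avg_tokens_per_doc
    return total_tokens / 1_000_000 * price_per_m


for n_docs in (10_000, 100_000, 1_000_000, 10_000_000):
    print(f"{n_docs:>10,} docs -> ${one_time_embed_cost(n_docs):,.2f}")
```

Changing `avg_tokens_per_doc` is the main lever: short FAQ entries and long chunked PDFs can differ by an order of magnitude in tokens per document, and the cost scales linearly with that number.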


Frequently Asked Questions

Can I use Claude as an embedding model?

No. Claude is a generation model: it doesn't expose hidden states or embeddings via API. Use a dedicated embedding model (Voyage, OpenAI, Cohere, or open-source).

Should I use voyage-3 or voyage-3-large?

Start with voyage-3 ($0.06/M). Switch to voyage-3-large ($0.18/M) only if you measure a meaningful quality gap on your eval set. For most RAG, voyage-3 is sufficient.

Does Voyage support Korean as well as English?

Yes. voyage-3 is multilingual with strong performance across Korean, Japanese, Chinese, and major European languages. Embed mixed-language corpora directly without translation.

How does reranking compare to using a larger embedding model?

Reranking with a small model + rerank-2 typically beats using a large embedding model alone on retrieval quality. The cost is also lower: vector search remains cheap, reranking targets only top-K. See production-ready pipeline in Claude + Pinecone Vector RAG.

Is there a free tier?

Voyage AI offers $50 in free credits on signup, which is enough to embed ~800M tokens or rerank ~1M queries. Anthropic Console users get additional credits.


AI Disclosure: Drafted with Claude Code; benchmarks from MTEB leaderboard and Voyage AI documentation May 2026.