Which is the best embedding model in 2026?

There is no single winner; it depends on your budget and language. On the English MTEB leaderboard, the open-weight gte-large-en-v1.5 tops the quality column (65.4) and bge-large-en-v1.5 (64.2) is close behind; both cost nothing in API fees if you can self-host. For a hosted API with no GPU, OpenAI's text-embedding-3-large gives the best proprietary quality (64.6), while text-embedding-3-small and voyage-3-lite are the cheapest at $0.02 per million tokens.

Is text-embedding-3-large worth it over text-embedding-3-small?

3-large scores about 64.6 on the MTEB average versus 62.3 for 3-small, a real but modest quality gain, at 6.5× the price ($0.13 vs $0.02 per million tokens) and 2× the vector size (3072 vs 1536 dims). For most RAG, 3-small is the better value; reach for 3-large only when retrieval quality is the bottleneck and the storage cost is acceptable. Both support Matryoshka truncation to shrink the vectors.
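
Client-side Matryoshka truncation can be sketched as follows. This is a minimal illustration, not the provider's implementation: it assumes the embedding is a plain list of floats, that you compare vectors by cosine similarity or dot product (so the shortened vector should be rescaled to unit length), and the function name `truncate_embedding` is hypothetical.

```python
import math

def truncate_embedding(vector, target_dim):
    """Keep the first target_dim components and rescale to unit length."""
    if not 0 < target_dim <= len(vector):
        raise ValueError("target_dim must be between 1 and len(vector)")
    head = vector[:target_dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero vector cannot be normalised
    return [x / norm for x in head]

# Toy 3-dim vector standing in for a real embedding (assumption for illustration).
full = [3.0, 4.0, 12.0]
short = truncate_embedding(full, 2)  # two leading components, unit length
```

Truncating only helps if the model was trained for it, which is why the table marks Matryoshka models explicitly.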

What is a good open-source alternative to OpenAI embeddings?

bge-large-en-v1.5 (MIT) and gte-large-en-v1.5 (Apache-2.0) are both free to run on your own GPU. On the English MTEB board, gte-large-en-v1.5 (65.4) beats OpenAI's text-embedding-3-large (64.6), and bge-large-en-v1.5 (64.2) comes within half a point of it. For non-English content including Sinhala and Tamil, multilingual-e5-large is the open pick. The trade-off is that you supply the compute: there is no per-token API charge, but you pay for the GPU.

How much does it cost to embed 1 million documents?

It depends on document length. At an average 500 tokens per document, one million documents is 500 million tokens. With text-embedding-3-small at $0.02 per million tokens that is 500 × $0.02 = $10 for a one-time pass; text-embedding-3-large costs 500 × $0.13 = $65. Open-weight models cost $0 in API fees. Enter your own document count and length above for an exact figure.

What does the MTEB score actually measure?

MTEB (Massive Text Embedding Benchmark) averages a model's performance across dozens of tasks — retrieval, classification, clustering, reranking, semantic similarity, and more. The Average column is the headline number; the Retrieval column is the most relevant one for RAG and search. Scores here are a dated snapshot of MTEB(eng, v1); the live leaderboard updates continuously, so treat them as a guide, not gospel.

Why do some models show 'n/a' for MTEB?

Some providers — Voyage, Mistral, and Google's gemini-embedding-001 among them — benchmark on their own private suites or on the multilingual MTEB v2 board rather than the English MTEB v1 board used here. Rather than invent a number that isn't directly comparable, those rows show n/a and are left out of the 'best quality' ranking. Their price, dimensions, and context length are still shown.

Do embedding dimensions matter for cost?

Yes — but for storage cost, not embedding cost. The per-token price is the same regardless of output dimensions, but larger vectors take more disk and RAM in your vector database and make similarity search slower. A 3072-dim vector costs roughly 4× the storage of a 768-dim one. Models marked Matryoshka can be truncated to a smaller dimension with only a small quality loss.
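
The storage ratio can be checked with a rough calculation. This sketch assumes vectors are stored as uncompressed float32 (4 bytes per dimension) and ignores index and metadata overhead, which varies by vector database; `raw_vector_bytes` is a hypothetical helper name.

```python
BYTES_PER_FLOAT32 = 4  # assumption: uncompressed float32 storage

def raw_vector_bytes(num_vectors, dims, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw size of the vectors alone, excluding index and metadata overhead."""
    return num_vectors * dims * bytes_per_value

one_million = 1_000_000
small = raw_vector_bytes(one_million, 768)    # 3,072,000,000 bytes, about 3.07 GB
large = raw_vector_bytes(one_million, 3072)   # 12,288,000,000 bytes, about 12.29 GB
ratio = large / small                         # 4.0
```

Quantised or compressed storage would shrink both figures, but the 4× ratio between the two dimensions stays the same.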

How is the 'Best value' recommendation calculated?

Best value maximises MTEB average divided by projected cost (quality per dollar). To avoid dividing by zero, free self-hosted models use a one-cent floor for the divisor, so among free models the ranking is simply by raw MTEB score. The Cheapest card shows the lowest projected cost, and Best quality shows the highest MTEB. The rule is fully deterministic — same inputs always give the same picks.

Best Embedding Model: Compare Price, Dimensions & MTEB Quality

Compare 15 text-embedding models from OpenAI, Cohere, Voyage, Google, Mistral, and the open-source world on price, dimensions, max tokens, and MTEB score — then project the exact cost to embed your corpus. No signup, no ads, sources cited below.

By Induwara Ashinsana, Executive Director, Ryzera Technologies · Updated Jun 20, 2026

Compare embedding models

Number of documents / chunks

Whole number, zero or more.

Avg tokens per document

≈ 750 words per 1,000 tokens. PDF page ≈ 500–600 tokens.

USD → LKR rate

For the LKR cost column. Edit to match your bank's rate.

Corpus presets

Re-embedding frequency

Priority for recommendation

Sort by

Pick: Best value

gte-large-en-v1.5

Free API (self-host)

Top quality-per-dollar — free to self-host (GPU not included).

Cheapest

gte-large-en-v1.5

Free API (self-host)

Free API cost — self-host (GPU not included).

Best quality (MTEB)

gte-large-en-v1.5

Free API (self-host)

Highest MTEB average (65.4).

All 15 of 15 models

Model	Provider	Notes	Dim	Max tokens	$/1M	MTEB	License	Cost (one-time)	Cost (LKR)
gte-large-en-v1.5	Alibaba	Cheapest; Top MTEB	1,024	8,192	—	65.4	Apache-2.0	Free API	—
bge-large-en-v1.5	BAAI		1,024	512	—	64.2	MIT	Free API	—
bge-base-en-v1.5	BAAI		768	512	—	63.5	MIT	Free API	—
nomic-embed-text-v1.5	Nomic	Matryoshka → 64d	768	8,192	—	62.3	Apache-2.0	Free API	—
e5-large-v2	intfloat		1,024	512	—	62.3	MIT	Free API	—
multilingual-e5-large	intfloat		1,024	514	—	61.5	MIT	Free API	—
text-embedding-3-small	OpenAI	Matryoshka → 512d	1,536	8,191	$0.02	62.3	Proprietary	$1.00	Rs 300
voyage-3-lite	Voyage AI		512	32,000	$0.02	n/a	Proprietary	$1.00	Rs 300
text-embedding-ada-002	OpenAI		1,536	8,191	$0.10	61.0	Proprietary	$5.00	Rs 1,500
embed-english-v3.0	Cohere		1,024	512	$0.10	64.5	Proprietary	$5.00	Rs 1,500
embed-multilingual-v3.0	Cohere		1,024	512	$0.10	n/a	Proprietary	$5.00	Rs 1,500
mistral-embed	Mistral		1,024	8,192	$0.10	n/a	Proprietary	$5.00	Rs 1,500
text-embedding-3-large	OpenAI	Matryoshka → 256d	3,072	8,191	$0.13	64.6	Proprietary	$6.50	Rs 1,950
gemini-embedding-001	Google	Matryoshka → 768d	3,072	2,048	$0.15	n/a	Proprietary	$7.50	Rs 2,250
voyage-3-large	Voyage AI	Matryoshka → 256d	1,024	32,000	$0.18	n/a	Proprietary	$9.00	Rs 2,700

Open-weight models cost $0 in API fees; you pay for GPU/compute instead, which this tool does not estimate (the value score floors the divisor at $0.01 so free models rank by raw MTEB). See “How it works” below.

All math runs in your browser. Prices & scores are a static 2026-06-20 snapshot — see Sources below.

How it works

Choosing a text-embedding model comes down to four numbers: how much it costs per token, how good its vectors are at retrieval, how many tokens it accepts per call, and how large each output vector is. This tool puts all four side by side for 15 of the most-used models and then projects the cost to embed your own corpus.

The cost projection uses one formula:

Total tokens = number of documents × average tokens per document.
Embedding cost = (total tokens ÷ 1,000,000) × price per 1M tokens. Each model's price comes from its provider's public pricing page, cited below.
Open-weight models (bge, gte, e5, nomic) show $0 API cost with a “self-host” label. We deliberately do not invent a GPU/compute price, since that depends on your hardware, and link to the self-hosting cost calculator instead.
Monthly re-embedding multiplies the one-time cost by 12 to show annualised spend; one-time shows the single figure.
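
The formula above can be sketched in a few lines. This is an illustrative version, not the page's actual data module: the function names `embedding_cost_usd` and `to_lkr` and the `runs_per_year` parameter are hypothetical, and the 300 LKR rate is only the example rate implied by the table.

```python
def embedding_cost_usd(num_docs, avg_tokens_per_doc, price_per_million, runs_per_year=1):
    """Projected API cost in USD; runs_per_year=12 models monthly re-embedding."""
    if num_docs < 0 or avg_tokens_per_doc < 0:
        raise ValueError("document count and token length must be zero or more")
    total_tokens = num_docs * avg_tokens_per_doc
    return total_tokens / 1_000_000 * price_per_million * runs_per_year

def to_lkr(cost_usd, usd_to_lkr_rate):
    """Convert a USD cost to LKR at a user-supplied rate."""
    return cost_usd * usd_to_lkr_rate

# 200,000 chunks x 400 tokens on text-embedding-3-small: about $1.60 one-time.
one_time = embedding_cost_usd(200_000, 400, 0.02)
# 1,000,000 docs x 600 tokens on text-embedding-3-large, monthly: about $936/yr.
yearly = embedding_cost_usd(1_000_000, 600, 0.13, runs_per_year=12)
# Same one-time cost in LKR at an example rate of 300.
one_time_lkr = to_lkr(one_time, 300)
```

Open-weight models fit the same function with a price of 0, which returns zero API cost and leaves GPU spend out of the picture, as the tool does.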

Quality is the MTEB (Massive Text Embedding Benchmark) average, displayed verbatim from the leaderboard snapshot and never computed by this page. The MTEB(eng, v1) board (snapshot 2026-06-20) is used so every score is comparable; the Retrieval sub-score is the most relevant one for search and RAG. Where a provider only publishes private or multilingual benchmarks, the MTEB cell shows “n/a” and that model is excluded from the quality ranking rather than guessed at.

The three recommendation cards are deterministic. Cheapest is the lowest projected cost. Best quality is the highest MTEB average. Best value maximises MTEB ÷ max(cost, $0.01) — quality per dollar — with the one-cent floor letting free self-hosted models rank by their raw MTEB instead of dividing by zero. Because all rates and scores are constants, identical inputs always produce identical output. The per-document and bulk cost formulas are cross-checked against each other in the data module to guard against arithmetic drift.
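
The three rules can be sketched as below, using a handful of rows and projected costs copied from the table. This is a hedged illustration rather than the page's own code: `Row`, `value_score`, and `pick_cards` are hypothetical names, and two details are assumptions the page does not state: cost ties are broken by higher MTEB, and n/a models are left out of the value ranking as well as the quality ranking.

```python
from typing import NamedTuple, Optional

class Row(NamedTuple):
    model: str
    cost_usd: float          # projected cost; 0.0 for self-hosted models
    mteb: Optional[float]    # None where the table shows n/a

ROWS = [
    Row("gte-large-en-v1.5", 0.0, 65.4),
    Row("bge-large-en-v1.5", 0.0, 64.2),
    Row("text-embedding-3-small", 1.00, 62.3),
    Row("voyage-3-lite", 1.00, None),
    Row("embed-english-v3.0", 5.00, 64.5),
    Row("text-embedding-3-large", 6.50, 64.6),
]

COST_FLOOR_USD = 0.01  # one-cent floor so free models never divide by zero

def value_score(row):
    """Quality per dollar: MTEB / max(cost, $0.01)."""
    return row.mteb / max(row.cost_usd, COST_FLOOR_USD)

def pick_cards(rows):
    scored = [r for r in rows if r.mteb is not None]
    # Assumption: equal costs are ordered by higher MTEB, with n/a last.
    cheapest = min(rows, key=lambda r: (r.cost_usd, -(r.mteb or 0.0)))
    best_quality = max(scored, key=lambda r: r.mteb)
    best_value = max(scored, key=value_score)
    return {
        "cheapest": cheapest.model,
        "best_quality": best_quality.model,
        "best_value": best_value.model,
    }

cards = pick_cards(ROWS)
paid_cards = pick_cards([r for r in ROWS if r.cost_usd > 0])
```

With the free rows included, all three cards land on gte-large-en-v1.5, matching the cards shown above; restricted to paid rows, the value pick shifts to text-embedding-3-small and the quality pick to text-embedding-3-large.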

Worked examples

Freelancer RAG index (one-time)

200,000 chunks × 400 tokens = 80,000,000 tokens

Total tokens: 200,000 × 400 = 80,000,000 (80M)
text-embedding-3-small @ $0.02/M: 80 × 0.02 = $1.60
text-embedding-3-large @ $0.13/M: 80 × 0.13 = $10.40
Cohere embed-english-v3.0 @ $0.10/M: 80 × 0.10 = $8.00
voyage-3-lite @ $0.02/M: 80 × 0.02 = $1.60
bge-large-en-v1.5 (open): $0 API — self-host on a GPU

Startup knowledge base (re-embedded monthly)

1,000,000 docs × 600 tokens = 600,000,000 tokens, annualised ×12

Total tokens: 1,000,000 × 600 = 600,000,000 (600M)
text-embedding-3-small: 600 × $0.02 = $12 → ×12 = $144/yr
text-embedding-3-large: 600 × $0.13 = $78 → ×12 = $936/yr
mistral-embed @ $0.10/M: 600 × 0.10 = $60 → ×12 = $720/yr

Boundary check (one-time)

Exactly 1,000,000 tokens — 1,000 docs × 1,000 tokens

Total tokens: 1,000 × 1,000 = 1,000,000 (1M)
text-embedding-3-small @ $0.02/M: 1 × 0.02 = $0.02
Zero documents on any model: $0.00 — no error, no negative cost

Frequently asked questions

Sources & references

Prices and MTEB scores on this page are a static snapshot last verified on 2026-06-20. Embedding pricing and the leaderboard change often — this page is reviewed quarterly. Spotted a stale number? Email me and I'll update it.

Found a stale price, a missing model, or a better MTEB source?

Email me at [email protected] — most fixes ship within 24 hours.