Question 1

How accurate is this token counter?

Accepted Answer

On common English prose, the estimate is within ±5 % of each vendor's official tokenizer. On code, emoji-heavy text, or non-Latin scripts, the gap can widen to ±15 %. Two independent methods (a regex BPE pre-tokenizer and a chars-per-token ratio) are computed for every model, and the result card flags low confidence when they disagree by more than 25 %.
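The two-method check can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the regex, the one-token-per-piece simplification, the 4-characters-per-token ratio, and the names `PRETOKEN_RE` and `estimate_tokens` are all assumptions made for the example.

```python
import re

# Hypothetical pre-tokenizer pattern (an assumption, not the tool's regex):
# letter runs, digit runs, and punctuation runs, each with an optional
# leading space, plus any leftover whitespace runs.
PRETOKEN_RE = re.compile(r" ?[^\W\d_]+| ?\d+| ?(?:[^\w\s]|_)+|\s+")

CHARS_PER_TOKEN = 4.0      # English rule of thumb
DISAGREEMENT_LIMIT = 0.25  # flag when the two methods differ by more than 25 %


def estimate_tokens(text: str) -> dict:
    """Estimate tokens two ways and flag low confidence when they disagree."""
    # Method 1: count pre-token pieces, treating each piece as one token.
    regex_count = len(PRETOKEN_RE.findall(text))
    # Method 2: divide the character count by a fixed ratio.
    ratio_count = len(text) / CHARS_PER_TOKEN

    # Relative disagreement, measured against the larger of the two counts.
    larger = max(regex_count, ratio_count)
    disagreement = abs(regex_count - ratio_count) / larger if larger else 0.0

    return {
        "regex_count": regex_count,
        "ratio_count": ratio_count,
        "estimate": round((regex_count + ratio_count) / 2),
        "low_confidence": disagreement > DISAGREEMENT_LIMIT,
    }


print(estimate_tokens("Hello, world!"))
```

For "Hello, world!" the regex yields 4 pieces and the ratio yields 3.25, a disagreement of about 19 %, so the sketch does not flag it. A single 40-character run of one letter yields 1 piece versus 10 and is flagged.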

Question 2

Why don't you just use OpenAI's tiktoken library directly?

Accepted Answer

Tiktoken's encoding tables are 1.7–3.5 MB each — shipping them client-side would push the page weight past our 400 KB performance target and make the tool feel sluggish on mobile. The regex method gives a close-enough estimate without the bundle hit. If you need exact counts (e.g. before paying per token at API scale), run the text through OpenAI's online tokenizer or call the official library server-side.

Question 3

What's a 'token' and why does it matter?

Accepted Answer

Large language models read text as tokens, not characters. A token is roughly a word or a sub-word fragment — for English, OpenAI says one token ≈ 4 characters or ~0.75 words. Token count matters because each model has a maximum context window (the total tokens it can read in one request) and most APIs charge per token of input and output. Counting before you send avoids hitting limits or surprise bills.
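As a rough illustration of the rules of thumb above and of per-token billing, here is a small sketch. The function names and the prices in the usage lines are hypothetical placeholders, not any vendor's actual rates.

```python
def rough_token_estimates(text: str) -> dict:
    """Apply the English rules of thumb: ~4 characters or ~0.75 words per token."""
    chars = len(text)
    words = len(text.split())
    return {
        "from_chars": chars / 4,     # one token per ~4 characters
        "from_words": words / 0.75,  # one token per ~0.75 words
    }


def estimated_cost(input_tokens: int, output_tokens: int,
                   input_price_per_million: float,
                   output_price_per_million: float) -> float:
    """Cost of one request when input and output are billed per million tokens."""
    return (input_tokens * input_price_per_million
            + output_tokens * output_price_per_million) / 1_000_000


sample = "The quick brown fox jumps over the lazy dog."
print(rough_token_estimates(sample))  # 44 characters, 9 words

# Hypothetical prices: $1.00 per million input tokens, $4.00 per million output.
print(estimated_cost(10_000, 2_000, 1.00, 4.00))
```

The two heuristics give 11 and 12 tokens for the sample sentence, which shows why they are only estimates.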

Question 4

Does sending Sinhala or Tamil text use more tokens?

Accepted Answer

Yes, significantly more. Most modern tokenizers (cl100k_base, o200k_base, Claude, Gemini) are trained mostly on English and Western European text, so Sinhala and Tamil characters typically use 2–3 tokens per character instead of the roughly 0.25 tokens per character common for English. A 100-word Sinhala paragraph can easily use 4–5× more tokens than a 100-word English paragraph saying the same thing. This counter applies a generic Unicode estimate; for Sinhala- or Tamil-heavy work, expect the real count to run higher.
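One way to account for this is a script-aware estimate, sketched below. This is not how the counter works (it applies a generic Unicode estimate); the per-character rates are simply the figures quoted above, with 2.5 taken as the midpoint of 2–3, and the function names are made up for the example. The Unicode block ranges for Sinhala (U+0D80–U+0DFF) and Tamil (U+0B80–U+0BFF) are standard.

```python
# Unicode blocks for the two scripts discussed above.
SCRIPT_BLOCKS = {
    "sinhala": (0x0D80, 0x0DFF),
    "tamil": (0x0B80, 0x0BFF),
}

# Assumed tokens-per-character rates, taken from the figures in the answer.
TOKENS_PER_CHAR = {"sinhala": 2.5, "tamil": 2.5, "other": 0.25}


def script_of(ch: str) -> str:
    """Classify one character as 'sinhala', 'tamil', or 'other'."""
    code = ord(ch)
    for name, (low, high) in SCRIPT_BLOCKS.items():
        if low <= code <= high:
            return name
    return "other"


def estimate_by_script(text: str) -> tuple[dict, float]:
    """Count characters per script and weight each by its assumed rate."""
    counts = {"sinhala": 0, "tamil": 0, "other": 0}
    for ch in text:
        counts[script_of(ch)] += 1
    tokens = sum(counts[name] * TOKENS_PER_CHAR[name] for name in counts)
    return counts, tokens


print(estimate_by_script("hello"))
print(estimate_by_script("\u0dc3\u0dd2\u0d82\u0dc4\u0dbd"))  # Sinhala, 5 code points
```

The sketch counts Unicode code points, so vowel signs and other combining marks each count as one character.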

Question 5

Why do GPT-4o and Claude give different counts for the same text?

Accepted Answer

Each vendor trained its own tokenizer vocabulary on different data. OpenAI's o200k_base (used by GPT-4o and GPT-5) has about 200 000 tokens and is roughly 5 % more efficient than cl100k_base. Anthropic's Claude tokenizer is also BPE-based but uses a distinct vocabulary that typically produces 10–15 % more tokens than tiktoken. Gemini uses SentencePiece, with efficiency similar to tiktoken. The per-vendor multipliers in this calculator are calibrated against published vendor benchmarks.
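The multiplier idea can be sketched as below. The values are illustrative only, derived from the percentages quoted above (with 1.125 as the midpoint of 10–15 %) and taking a cl100k_base-style count as the baseline; they are not the calculator's calibrated multipliers, and the names are hypothetical.

```python
# Illustrative multipliers relative to a cl100k_base-style baseline count.
# These are assumptions drawn from the percentages in the answer above.
VENDOR_MULTIPLIERS = {
    "cl100k_base": 1.00,
    "o200k_base": 0.95,   # roughly 5 % more efficient than cl100k_base
    "claude": 1.125,      # midpoint of 10-15 % more tokens
    "gemini": 1.00,       # similar efficiency to tiktoken
}


def per_vendor_counts(base_tokens: float) -> dict:
    """Scale one baseline estimate into a rounded count for each vendor."""
    return {
        name: round(base_tokens * multiplier)
        for name, multiplier in VENDOR_MULTIPLIERS.items()
    }


print(per_vendor_counts(1000))
```

With a baseline of 1000 tokens this gives 950 for o200k_base and 1125 for Claude, which is the kind of spread you see between result cards for the same text.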

Question 6

Is the text I paste sent anywhere?

Accepted Answer

No. Tokenization runs entirely in your browser using JavaScript — the text never leaves your device. There's no API call, no server log, and no analytics tracking of input content. You can verify by opening your browser's Network tab and watching: nothing leaves while you type.

Question 7

What's the difference between the input context window and output limits?

Accepted Answer

The context window is the total number of input and output tokens the model can handle in one turn. Output is usually capped much lower: GPT-5 has a 400 k context window but a 16 k–128 k max output; Claude Opus 4.7 has a 1 M context window but caps each response at 64 k tokens. This counter compares your input to the total context window. Subtract your expected output budget to know how much room you actually have for the prompt.
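The subtraction described above is simple enough to sketch directly. The function names are hypothetical, and the window and output figures are the ones quoted in this answer, so check them against the vendor's current spec before relying on them.

```python
from typing import Optional


def prompt_budget(context_window: int, expected_output: int,
                  max_output: Optional[int] = None) -> int:
    """Tokens left for the prompt after reserving room for the response."""
    if max_output is not None and expected_output > max_output:
        raise ValueError("expected output exceeds the model's output cap")
    return context_window - expected_output


def fits(input_tokens: int, context_window: int, expected_output: int) -> bool:
    """True if the prompt plus the reserved output fits in the context window."""
    return input_tokens <= prompt_budget(context_window, expected_output)


# Figures quoted above: 400 k window with a 128 k output reserve,
# and 1 M window with a 64 k output reserve.
print(prompt_budget(400_000, 128_000, max_output=128_000))  # 272000
print(prompt_budget(1_000_000, 64_000, max_output=64_000))  # 936000
print(fits(300_000, 400_000, 128_000))                      # False
```

A 300 k-token prompt is under the 400 k window on its own, but it no longer fits once 128 k tokens are reserved for the response.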

Question 8

How often are the model context windows updated?

Accepted Answer

Quarterly. Context windows are reviewed against vendor model-spec pages and updated whenever a vendor publishes a change. Last verified: 2026-05-12. If you spot a stale figure, email me and I'll update it.

AI Token Counter — GPT-5, Claude 4.x, Gemini 3 & Llama 4
