Google's AI can't spell — and what it tells builders

Google's AI can't spell its own name in a screenshot doing the rounds this week, and TechCrunch has a write-up titled "Why Google's AI can't spell Google (or anything else)" cataloguing the embarrassment. The funny part is the screenshot. The useful part is the reason behind it. If you are building anything on top of an LLM in Colombo, Galle, or your bedroom in Kandy, the reason matters more than the joke.

I want to talk about why this happens at all, why it keeps happening to the most expensive model on earth, and what you should and shouldn't trust an AI to do when you ship something.

🔡 LLMs don't see letters, they see tokens

Every large language model — Gemini, Claude, GPT, Llama, the lot — reads text in tokens, not characters. A tokenizer chops your input into chunks that are roughly word-sized, or sub-word for unusual strings. The word Google might be a single token. The word gooooogle might be split into three. The model never gets to peek at the individual letters inside a token unless those letters happen to be their own token.

Once you internalise that, "AI can't spell" stops being mysterious.

Key takeaway: An LLM doesn't read characters. It predicts the next token from a stream of tokens. Asking it to count letters or spell things backwards is asking it to do work it has no eyes for.

A quick sense of what tokenization looks like for common words on a typical BPE tokenizer:

| Input | Approx. tokens | Why |
| --- | --- | --- |
| `Google` | 1 | Extremely common, learned as one chunk |
| `google.com` | 2 | Domain pattern split at the dot |
| `Googlle` (typo) | 2–3 | Unfamiliar, broken into pieces |
| `induwara.lk` | 3–4 | Rare domain, fully sub-worded |
| `ශ්‍රී ලංකා` | many | Non-Latin scripts often fragment heavily |

The model is good at "what word usually follows this word." It is bad at "how many letters are in this word" because nothing in its training signal rewards counting glyphs.
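
Here's a toy sketch of that idea in Python. It is not a real BPE tokenizer: it does greedy longest-match against a tiny vocabulary I made up (`TOY_VOCAB` is purely an assumption for illustration), and a production tokenizer will split these strings differently. All it shows is that a familiar word comes out as one chunk while a typo shatters into pieces.

```python
# Toy illustration only: greedy longest-match over a made-up vocabulary.
# Real tokenizers learn their vocabulary from data and will split differently.
TOY_VOCAB = {"google", "goo", "gl", "le", ".com", "com"}

def toy_tokenize(text: str) -> list[str]:
    """Split text into the longest known chunks, falling back to single characters."""
    text = text.lower()
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in TOY_VOCAB:
                tokens.append(piece)
                i = end
                break
        else:
            # Nothing in the vocabulary matched: emit one character on its own.
            tokens.append(text[i])
            i += 1
    return tokens

print(toy_tokenize("Google"))      # ['google']
print(toy_tokenize("google.com"))  # ['google', '.com']
print(toy_tokenize("Googlle"))     # ['goo', 'gl', 'le']
```

The model downstream only ever receives those chunks (as integer IDs in practice), which is why the question "how many l's are in this word?" has no direct answer from where it sits.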

🧠 So why does the biggest model on the planet still fail?

Scale doesn't fix a representation problem. You can throw a trillion parameters at predicting the next token, and the model still doesn't get character-level vision for free. It learns letter-level facts the same way it learns geography — by reading enough examples in the training corpus to memorise them statistically.

That's why models can spell cat and Sri Lanka perfectly but fall apart on:

- Long compound words they haven't memorised
- Brand names with deliberate misspellings (Lyft, Flickr, Tumblr)
- Localised strings, transliterations, and street-name spellings
- Their own product names, when those names appear inside generated UI chrome rather than free text

There is also a second factor TechCrunch hints at: rendered output is not the same as predicted text. Some Google surfaces stitch generated text into templates, post-process with smaller models, or run image-based rendering. Every additional step is another chance for letters to drop.

A model that nails "summarise this PDF" can still botch "spell the company name on the button." Different jobs, different failure modes.

🛠️ A short list of things AI is bad at, and what to use instead

I keep this mental list when I'm building. It saves me from shipping bugs that look stupid in screenshots.

| Task | LLM is… | Use instead |
| --- | --- | --- |
| Counting letters in a string | Bad | A `length` call, or character counter |
| Counting words in a paragraph | Mediocre | Word counter |
| Reversing a string | Bad | One line of code |
| Strict regex matching | Mediocre | Regex tester |
| Formatting JSON | Mediocre | JSON formatter |
| Generating a URL slug | Decent | Slug generator if you want determinism |
| Spelling rare proper nouns | Bad | A dictionary, or copy-paste from the source |
| Summarising long text | Good | LLM |
| Drafting code with context | Good | LLM |
| Rewriting tone | Good | LLM |

The rule is simple: if the task is deterministic and a five-line function can solve it, don't pay an LLM to do it. You'll burn tokens, add latency, and ship the occasional misspelled brand name into a screenshot that ends up on TechCrunch.
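
To show how small these functions really are, here is a minimal Python sketch of a few rows from the table. The function names are my own, and the slug rules (lowercase ASCII, hyphens between alphanumeric runs) are one reasonable assumption rather than a standard.

```python
import re
import unicodedata

def count_letter(text: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter."""
    return text.lower().count(letter.lower())

def reverse(text: str) -> str:
    """Reverse by code point. Fine for plain Latin text; scripts with
    combining marks (Sinhala, for one) need grapheme-aware handling."""
    return text[::-1]

def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())

def slugify(title: str) -> str:
    """Lowercase, drop non-ASCII characters, join alphanumeric runs with hyphens."""
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")

print(len("Google"))                                  # 6
print(count_letter("Google", "o"))                    # 2
print(reverse("Google"))                              # elgooG
print(word_count("Google's AI can't spell"))          # 4
print(slugify("Why Google's AI can't spell Google"))  # why-google-s-ai-can-t-spell-google
```

Every one of these returns the same answer on every call, costs no tokens, and runs in microseconds.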

💡 What this means if you're building on free tiers from Sri Lanka

If you're a UCSC student wiring up a side project, or a freelancer in Colombo plugging an AI feature into a client's WordPress site, the practical implications are:

- Validate AI output before you display it. Run a regex check that the brand name in the response matches the brand name in your config. One assert beats a thousand apologies.
- Keep deterministic logic deterministic. Word counts, character limits, URL slugs, IDs, hashes, dates — these belong in code, not in a prompt. The Sri Lanka NIC decoder is twenty lines of arithmetic, not a model call, for exactly this reason.
- Treat AI as a draft layer. Generate, then check, then ship. The cheapest checker is a Zod schema or a regex. The next cheapest is a smaller, faster model used as a critic.
- Don't waste context on tokenization workarounds. Asking the model to "spell carefully" or "think letter by letter" mostly inflates your bill without fixing the underlying issue. Use the right tool.
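
The first point on that list can be sketched in a few lines of Python. Note the swap: instead of a pure regex, this version uses the standard library's `difflib` to catch near-misses, because a plain exact match can't tell you that a mangled variant slipped in. `BRAND`, the function names, and the `0.75` cutoff are all assumptions for illustration; tune them against your own text.

```python
import difflib
import re

BRAND = "Google"  # assumption: in a real project this comes from your config

def brand_misspellings(text: str, brand: str = BRAND, cutoff: float = 0.75) -> list[str]:
    """Return words that look like the brand name but are not spelled exactly like it."""
    suspects = []
    for word in re.findall(r"[A-Za-z]+", text):
        if word == brand:
            continue  # exact spelling, including case: fine
        similarity = difflib.SequenceMatcher(None, word.lower(), brand.lower()).ratio()
        if similarity >= cutoff:
            suspects.append(word)
    return suspects

def safe_to_display(text: str) -> bool:
    """Gate for AI output: block anything containing a near-miss of the brand."""
    return not brand_misspellings(text)

print(brand_misspellings("Sign in with Goooglle to continue"))  # ['Goooglle']
print(safe_to_display("Sign in with Google to continue"))       # True
```

A similarity check like this will throw false positives on legitimate look-alike words, and it flags wrong casing too, so treat a hit as "hold for review" rather than "reject outright".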

There is also a free-tier angle. Every misspelling that goes live is an SEO problem. Pages with broken proper nouns and obvious AI artefacts read as low quality to users, and most likely to search engines too. If you're trying to rank in a competitive niche from a small island with no link budget, you cannot afford to let the model mangle your own brand name in a meta description.

🌐 What this means for you

If you take one thing from the Google story, take this: an AI's confidence is not a quality signal. The model that confidently spells Goooglle in a screenshot is the same model that will confidently invent a citation, hallucinate an API method, or fabricate a quote from a Sri Lankan minister. The pattern is the same each time: fluent, confident output with nothing checking it. Only the cost varies.

So write your code as if the model is a junior intern with brilliant instincts and zero accountability. Let it draft, summarise, restructure, and rewrite. Don't let it count, spell, or commit to facts unsupervised. Wrap it in tests, regex guards, and schema validation. When the deterministic answer exists in a twenty-line function, write the function. When it lives in a public tool, use the tool.
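
As a rough idea of what "schema validation" means in practice, here is a standard-library Python sketch that stands in for something like a Zod schema. The payload shape (`title`, `description`) and the 160-character limit are hypothetical, picked only to make the example concrete.

```python
import json

# Hypothetical shape for an AI-generated meta description payload.
REQUIRED_FIELDS = {"title": str, "description": str}
MAX_DESCRIPTION_CHARS = 160  # assumption: a rule of thumb, adjust to taste

def parse_model_reply(raw: str) -> dict:
    """Parse and validate a model reply, raising ValueError on anything unexpected."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"field {field!r} is missing or not a {expected_type.__name__}")
    if len(data["description"]) > MAX_DESCRIPTION_CHARS:
        raise ValueError("description is too long")
    return data

reply = '{"title": "Free tools", "description": "Small utilities that run in your browser."}'
print(parse_model_reply(reply)["title"])  # Free tools
```

Anything that fails here never reaches the page: you retry, fall back to a hand-written default, or log it for a human to look at.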

Bottom line: Google's misspelling isn't a Google problem. It's a tokenizer problem with a Google logo on it. Build like you know that, and your users won't end up screenshotting your product for the wrong reasons.
