Question 1

Is my image uploaded to a server?

Accepted Answer

No. The image is loaded into a canvas and run through the Tesseract LSTM model in your browser via WebAssembly, and the extracted text appears in the same tab; the file never leaves your device. The first run downloads language data once from the jsDelivr tessdata CDN, but the image bytes themselves are never sent anywhere. To verify, open your browser's developer tools and watch the Network tab while a run is in progress: no request should carry the image.

Question 2

How accurate is browser-based OCR compared to Google Vision?

Accepted Answer

On clean, well-lit, print-quality English text — screenshots, scanned documents, web captures — Tesseract LSTM reaches 95–99% character accuracy, which is competitive with paid cloud services. Accuracy drops on handwriting (Tesseract is print-trained), heavily skewed photos, mixed scripts, and stylised fonts. For Sinhala and Tamil print expect 85–95%; for handwriting expect to edit substantially.
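The page does not say how its character-accuracy figures were measured. A common definition, assumed here, is 100 × (1 − edit distance ÷ reference length), where the edit distance counts the insertions, deletions, and substitutions needed to turn the OCR output into the known-correct text. The function names below are illustrative only.

```typescript
// Levenshtein edit distance between two strings, counted in Unicode code points.
function editDistance(a: string, b: string): number {
  const s = Array.from(a);
  const t = Array.from(b);
  // prev[j] = distance between the first i-1 characters of s and the first j of t.
  let prev: number[] = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost // substitution or match
      );
    }
    prev = curr;
  }
  return prev[t.length];
}

// Character accuracy as a percentage (assumed definition), floored at 0.
function characterAccuracy(reference: string, ocrOutput: string): number {
  const refLength = Array.from(reference).length;
  if (refLength === 0) {
    return ocrOutput.length === 0 ? 100 : 0;
  }
  const errors = editDistance(reference, ocrOutput);
  return Math.max(0, 100 * (1 - errors / refLength));
}
```

Under this definition, one wrong character in a 20-character line gives 95%. For Sinhala and Tamil, a code point is not always what a reader sees as one character, so a grapheme-based count may be a fairer measure there.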

Question 3

Which languages are supported?

Accepted Answer

Three are loaded on this page: English (4 MB), Sinhala (සිංහල, 10 MB), and Tamil (தமிழ், 9 MB). You can pick one or several at once; Tesseract applies them in the order you select. Multi-language jobs are slower and download more data on first run. The full Tesseract project supports 100+ languages, so open an issue if you'd like one added here.
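Tesseract conventionally takes multiple languages as one string of codes joined with `+`, for example `eng+sin`. The sketch below builds that string while preserving selection order. Whether this page assembles its request the same way is an assumption, and `buildLanguageString` is a hypothetical helper name.

```typescript
// Language codes available on this page.
const LANGUAGES = {
  eng: "English",
  sin: "Sinhala",
  tam: "Tamil",
} as const;

type LangCode = keyof typeof LANGUAGES;

// Join the selected codes with "+", keeping the user's order and
// dropping any code that was selected more than once.
function buildLanguageString(selected: LangCode[]): string {
  if (selected.length === 0) {
    throw new Error("Select at least one language");
  }
  const unique = selected.filter((code, i) => selected.indexOf(code) === i);
  return unique.join("+");
}
```

Selecting Sinhala and then English yields `sin+eng`, while the reverse order yields `eng+sin`.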

Question 4

Why is the first run so slow?

Accepted Answer

On the first OCR run the browser fetches the WebAssembly engine (~2 MB) and the trained-data files for each picked language (4–10 MB each), then warms up Tesseract. After that the engine and data are cached by the browser, and subsequent runs start in under a second. Clearing browser data resets the cache; private/incognito windows usually do not cache between sessions.
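As a rough illustration of the arithmetic, the sketch below adds the approximate engine size to the trained-data size of each selected language, skipping anything already cached. The sizes are the approximate figures quoted on this page; `CacheState` and `estimateDownloadMb` are hypothetical names, not part of the tool.

```typescript
const ENGINE_SIZE_MB = 2; // approximate WebAssembly engine size quoted on this page

// Approximate trained-data sizes quoted on this page.
const LANGUAGE_SIZE_MB: Record<string, number> = { eng: 4, sin: 10, tam: 9 };

interface CacheState {
  engineCached: boolean;
  cachedLanguages: string[];
}

// Estimate how many MB a run has to download, given what is already cached.
function estimateDownloadMb(
  languages: string[],
  cache: CacheState = { engineCached: false, cachedLanguages: [] }
): number {
  let total = cache.engineCached ? 0 : ENGINE_SIZE_MB;
  const seen: string[] = [];
  for (const code of languages) {
    const size = LANGUAGE_SIZE_MB[code];
    if (size === undefined) {
      throw new Error(`Unknown language code: ${code}`);
    }
    if (seen.indexOf(code) !== -1) continue; // count each language once
    seen.push(code);
    if (cache.cachedLanguages.indexOf(code) === -1) {
      total += size;
    }
  }
  return total;
}
```

With an empty cache, English alone comes to about 6 MB and all three languages to about 25 MB. Adding Tamil after an earlier English-only run costs only the 9 MB of Tamil data.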

Question 5

What is page segmentation, and which setting should I pick?

Accepted Answer

The page segmentation mode (PSM) tells Tesseract how to slice the image before recognition. PSM 3 (Auto) is the right call for most screenshots and scans. Pick PSM 6 (Single block) for a tightly cropped paragraph, PSM 7 (Single line) for a name tag or banner, or PSM 11 (Sparse text) for charts, menus, or photos with scattered labels. If the result is missing obvious text, try a different PSM before assuming the model failed.
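The guidance above can be written down as a small lookup. The numeric values are Tesseract's own PSM numbers; the `Layout` labels, the helper names, and the retry order in `fallbackOrder` are assumptions made for illustration, not how this tool is known to behave.

```typescript
// Page segmentation modes offered on this page (Tesseract's numeric values).
const PSM = {
  AUTO: 3,
  SINGLE_BLOCK: 6,
  SINGLE_LINE: 7,
  SPARSE_TEXT: 11,
} as const;

type PsmValue = (typeof PSM)[keyof typeof PSM];

// Hypothetical labels for the kind of image being read.
type Layout = "page" | "paragraph" | "line" | "scattered";

// Map an image layout to the mode recommended in the answer above.
function pickPsm(layout: Layout): PsmValue {
  switch (layout) {
    case "page":
      return PSM.AUTO; // screenshots and scans
    case "paragraph":
      return PSM.SINGLE_BLOCK; // tightly cropped paragraph
    case "line":
      return PSM.SINGLE_LINE; // name tag or banner
    case "scattered":
      return PSM.SPARSE_TEXT; // charts, menus, scattered labels
  }
}

// Modes to try in turn when the first pass misses obvious text.
// The order after the first mode is an arbitrary choice for this sketch.
function fallbackOrder(first: PsmValue): PsmValue[] {
  const all: PsmValue[] = [PSM.AUTO, PSM.SINGLE_BLOCK, PSM.SPARSE_TEXT, PSM.SINGLE_LINE];
  return [first].concat(all.filter((mode) => mode !== first));
}
```

A caller would run recognition with `pickPsm(...)` first and step through `fallbackOrder(...)` only if the output is visibly incomplete.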

Question 6

Can it read handwriting?

Accepted Answer

Not reliably. The Tesseract LSTM is trained on print, so it treats handwriting like an unusual font and tends to misread or skip cursive strokes. Best-case clean block-letter handwriting (caps, ruled paper, dark ink) reaches around 60–75% accuracy; cursive is much worse. For handwriting at scale you want a dedicated handwriting recogniser; this tool is best for printed text and screenshots.

Question 7

What does the confidence number mean?

Accepted Answer

Tesseract reports a per-image confidence score between 0 and 100, averaged across all recognised characters. We grade it: 85+ is High (likely usable as-is), 70–84 is Medium (skim for errors), 50–69 is Low (substantial editing), under 50 is Very low (re-shoot the image with better lighting, focus, or contrast). Confidence is a heuristic, not ground truth: even a run that scores 95 can misread look-alike characters such as 0/O, 1/l, or rn/m.
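The grading bands translate directly into a threshold check. This is a minimal sketch: `gradeConfidence` is a hypothetical name, and treating fractional scores by their lower bound (so 84.9 is Medium) is an assumption, since the bands above are stated in whole numbers.

```typescript
type Grade = "High" | "Medium" | "Low" | "Very low";

// Map a 0-100 confidence score to the grade bands described above.
function gradeConfidence(confidence: number): Grade {
  if (!isFinite(confidence) || confidence < 0 || confidence > 100) {
    throw new RangeError(`Confidence must be between 0 and 100, got ${confidence}`);
  }
  if (confidence >= 85) return "High"; // likely usable as-is
  if (confidence >= 70) return "Medium"; // skim for errors
  if (confidence >= 50) return "Low"; // substantial editing
  return "Very low"; // re-shoot the image
}
```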

Question 8

What are the file size and dimension limits?

Accepted Answer

Files can be up to 25 MB each, with maximum dimensions of 4,096 × 4,096 px. JPG, PNG, WebP, BMP, and TIFF are accepted. The dimension cap protects mid-range phones from out-of-memory crashes when decoding very large scans; resize a larger photo first with our Image Resizer.
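A pre-flight check along these lines could reject a file before any decoding starts. This sketch makes several assumptions: 1 MB is taken as 1,048,576 bytes, the accepted formats are matched by their usual MIME types, and `ImageInfo`, `checkImage`, and `fitScale` are hypothetical names.

```typescript
const MAX_FILE_BYTES = 25 * 1024 * 1024; // assumes 1 MB = 1,048,576 bytes
const MAX_DIMENSION_PX = 4096;
const ACCEPTED_TYPES = ["image/jpeg", "image/png", "image/webp", "image/bmp", "image/tiff"];

interface ImageInfo {
  type: string; // MIME type, e.g. "image/png"
  sizeBytes: number;
  width: number;
  height: number;
}

// Return a list of human-readable problems; an empty list means the image is acceptable.
function checkImage(info: ImageInfo): string[] {
  const problems: string[] = [];
  if (ACCEPTED_TYPES.indexOf(info.type) === -1) {
    problems.push(`Unsupported format: ${info.type}`);
  }
  if (info.sizeBytes > MAX_FILE_BYTES) {
    problems.push("File is larger than 25 MB");
  }
  if (info.width > MAX_DIMENSION_PX || info.height > MAX_DIMENSION_PX) {
    problems.push(`Image exceeds ${MAX_DIMENSION_PX} px on at least one side`);
  }
  return problems;
}

// Scale factor (at most 1) that brings both sides within the dimension cap.
function fitScale(width: number, height: number): number {
  return Math.min(1, MAX_DIMENSION_PX / width, MAX_DIMENSION_PX / height);
}
```

For an 8,192 × 4,096 px scan, `fitScale` returns 0.5, so resizing to 4,096 × 2,048 px would bring it within the cap.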

Question 9

Does it work offline?

Accepted Answer

After the first successful run on a given device, yes. The WebAssembly engine and trained-data files live in your browser's Cache Storage / IndexedDB, so re-opening this page over a flaky connection still works — the recognition itself needs no network. If you clear browser data, the engine and data are downloaded again.

Question 10

How is this different from i2OCR or OnlineOCR.net?

Accepted Answer

Those sites upload your image to a remote server and rate-limit free use (5 or 10 images per day). This tool runs the same class of model (Tesseract LSTM) entirely in your browser, with no quota and no signup, and your image bytes never leave the device. The trade-off: the first run downloads roughly 6–25 MB of engine and language data (about 2 MB of engine plus 4–23 MB for the languages you pick). Cloud services avoid that download by doing the recognition on their servers, at the cost of the upload and the quota.

Language	Code	First-load size	Notes
English	eng	4 MB	Most accurate on screenshots, printed documents, and signage.
Sinhala · සිංහල	sin	10 MB	Trained on Sinhala print. Cursive and handwriting accuracy is lower.
Tamil · தமிழ்	tam	9 MB	Trained on Tamil print. Cursive and handwriting accuracy is lower.

PSM	Mode	Best for
3	Auto (default)	Fully automatic page segmentation. Best for screenshots, scans, and most photos.
6	Single block	Assume a uniform block of text. Best for a tightly cropped paragraph.
7	Single line	Treat the image as one line. Best for a name tag, slogan, or single sentence.
11	Sparse text	Find as much text as possible in no particular order. Best for menus, charts, signs.

Image to Text (OCR) — Free, In-Browser, No Signup
