induwara.lk

Image to Text (OCR) — Free, In-Browser, No Signup

Drop a photo, screenshot, or scan and get plain text out. The Tesseract LSTM model runs entirely in your browser, so your image never touches a server. Supports English, Sinhala, and Tamil. The first run downloads ~4 MB of English language data (more for Sinhala or Tamil); later runs start in about a second.

By Induwara Ashinsana · Updated May 11, 2026

What this does

Pick a photo, screenshot, or scan — the Tesseract LSTM model below recognises text inside it entirely in your browser. The first run downloads ~4 MB of language data; later runs reuse the cached data and start in a second. Output is plain text you can copy or download.

Powered by tesseract.js (Tesseract LSTM, compiled to WebAssembly). Trained data fetched once from the jsDelivr tessdata CDN and cached in your browser. Last verified 2026-05-11.

How it works

The tool runs the Tesseract LSTM optical character recogniser, the neural rewrite of the engine that HP open-sourced in 2005 and Google later maintained, through tesseract.js, a WebAssembly wrapper that loads the engine and trained data into a Web Worker inside your browser. The worker fetches the language .traineddata files once from the jsDelivr tessdata CDN and caches them in your browser. Every subsequent image is recognised locally: the photo bytes themselves are never sent over the network.

A run on one image goes through five deterministic steps:

  1. Validate. The file must be JPG, PNG, WebP, BMP, or TIFF, under 25 MB, and at most 4,096 × 4,096 px. Rejected files show a specific reason on screen rather than failing silently.
  2. Decode. Your browser's native image decoder reads the bytes into an ImageBitmap. No third-party decoder runs on your image.
  3. Pre-process. Tesseract.js converts the bitmap to greyscale, normalises the DPI to 70 (its default), and hands the buffer to the WebAssembly engine running in a Web Worker.
  4. Recognise. Tesseract's LSTM recogniser walks each layout block, line, and word, choosing the most likely characters given the picked page-segmentation mode and the loaded language model. It returns text plus a per-character, per-line, and per-image confidence score.
  5. Post-process. When "Tidy paragraphs" is on, the raw output is de-hyphenated across line breaks (“informa-\ntion” → “information”) and hard-wrapped lines are collapsed so each paragraph is a single line. The original raw output is also available for download, which is useful when the source layout matters more than readability.
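
Steps 1 and 5 above are simple enough to sketch in isolation. The following is a minimal illustration, not the page's actual code: the names `validateImage`, `tidyParagraphs`, and `ImageInfo` are hypothetical, "25 MB" is assumed to mean 25 × 1024 × 1024 bytes, and the de-hyphenation only joins ASCII letters.

```typescript
// Limits quoted in step 1. The MIME strings are the standard ones for each format.
const ALLOWED_TYPES: string[] = [
  "image/jpeg", "image/png", "image/webp", "image/bmp", "image/tiff",
];
const MAX_BYTES = 25 * 1024 * 1024; // assumption: "25 MB" read as 25 MiB
const MAX_SIDE_PX = 4096;

// Hypothetical shape: what a caller would know after reading the file header.
interface ImageInfo {
  mimeType: string;
  bytes: number;
  width: number;
  height: number;
}

// Returns null when the file passes, otherwise a specific rejection reason.
function validateImage(info: ImageInfo): string | null {
  if (ALLOWED_TYPES.indexOf(info.mimeType) === -1) {
    return "Unsupported format: " + info.mimeType;
  }
  if (info.bytes >= MAX_BYTES) {
    return "File must be under 25 MB";
  }
  if (info.width > MAX_SIDE_PX || info.height > MAX_SIDE_PX) {
    return "Image must be at most 4,096 x 4,096 px";
  }
  return null;
}

// Step 5: join words split by a hyphen at a line break, then collapse
// hard-wrapped lines so each paragraph becomes a single line.
function tidyParagraphs(raw: string): string {
  return raw
    .replace(/\r\n/g, "\n")
    .split(/\n{2,}/) // blank lines separate paragraphs
    .map((paragraph) =>
      paragraph
        .replace(/([A-Za-z])-\n(?=[A-Za-z])/g, "$1") // "informa-\ntion" -> "information"
        .replace(/\s*\n\s*/g, " ") // remaining hard wraps become single spaces
        .trim()
    )
    .filter((paragraph) => paragraph.length > 0)
    .join("\n\n");
}
```

One limitation of this sketch: a genuine hyphen that happens to fall at a line end ("well-\nknown") is also removed, which is one reason to keep the raw output available.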

Processing time per image scales linearly with megapixels. On a 2022 MacBook Air running the WASM engine, a 1 MP English screenshot finishes in about 6 seconds; a 3 MP page takes about 14 seconds. Sinhala and Tamil are roughly 50% slower because the orthography has more glyph classes.

The page exposes two estimators so you can sanity-check the wait before clicking. The closed-form throughput estimator computes seconds = megapixels × secondsPerMP + 2 s, where secondsPerMP is 4 for English and 6 for Sinhala or Tamil. The lookup estimator interpolates a piecewise table calibrated against MacBook Air (M2) and Pixel 7 runs. For a 3 MP English job both estimators predict 14.0 s; in general the two agree to within ~10%.
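
The closed-form estimator can be written out in a few lines. This is a sketch using the constants quoted above; the names `estimateSeconds`, `SECONDS_PER_MP`, and `SETUP_SECONDS` are illustrative and not taken from the page's source.

```typescript
type Lang = "eng" | "sin" | "tam";

// Per-megapixel rates and fixed setup cost, as quoted in the text.
const SECONDS_PER_MP: Record<Lang, number> = { eng: 4, sin: 6, tam: 6 };
const SETUP_SECONDS = 2;

// seconds = megapixels × secondsPerMP + 2 s
function estimateSeconds(megapixels: number, lang: Lang): number {
  return megapixels * SECONDS_PER_MP[lang] + SETUP_SECONDS;
}

console.log(estimateSeconds(1, "eng")); // 6
console.log(estimateSeconds(3, "eng")); // 14
console.log(estimateSeconds(3, "sin")); // 20
```

The lookup estimator is not sketched here because its calibration table is not published on the page.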

Page-segmentation mode controls how Tesseract carves the image before recognition. PSM 3 (Auto) is the default and the right call for most screenshots and scans. PSM 6 (Single block) treats the whole image as one paragraph — best for a tightly cropped quote. PSM 7 (Single line) is for a single line of text such as a name tag. PSM 11 (Sparse text) finds isolated words in no particular order, which is what charts, menus, and signage need.

Worked examples

English screenshot of a paragraph

1000×400 px PNG screenshot of a single paragraph of clean black-on-white text from a news website.

  1. Input: 1,000 × 400 px PNG · 140.0 KB
  2. Pixels: 0.40 MP · Languages: English
  3. Estimated processing on a laptop CPU: 3.6 s
  4. Outcome: 0.4 MP × 4 s/MP + 2 s setup ≈ 3.6 s. Output is the paragraph as a single block of text with confidence ≈ 92%.

Scanned A4 invoice

2000×3000 px JPEG of a printed invoice — header, line items, total, footer.

  1. Input: 2,000 × 3,000 px JPEG · 2.9 MB
  2. Pixels: 6.00 MP · Languages: English
  3. Estimated processing on a laptop CPU: 26.0 s
  4. Outcome: 6.0 MP × 4 s/MP + 2 s setup ≈ 26 s. With PSM 3 (auto) the columns are preserved as separate paragraphs in the output.

Sinhala newspaper clipping

1500×2000 px PNG of a printed Sinhala paragraph from a daily newspaper. Adding Sinhala (සිංහල) to a job downloads ~10 MB of language data on first load.

  1. Input: 1,500 × 2,000 px PNG · 1.7 MB
  2. Pixels: 3.00 MP · Languages: Sinhala + English
  3. Estimated processing on a laptop CPU: 20.0 s
  4. Outcome: 3.0 MP × 6 s/MP + 2 s setup ≈ 20 s. Accuracy ~85–92% on clean print, lower on stylised or cursive fonts.
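
The three worked examples can be reproduced from pixel dimensions alone. The sketch below is illustrative: `megapixels` and `estimateJob` are hypothetical names, and it assumes a multi-language job runs at the rate of its slowest language, which is inferred from the Sinhala + English example using 6 s/MP rather than stated anywhere on the page.

```typescript
type LangCode = "eng" | "sin" | "tam";

// Seconds per megapixel for each language, as used in the examples above.
const RATE: Record<LangCode, number> = { eng: 4, sin: 6, tam: 6 };

function megapixels(width: number, height: number): number {
  return (width * height) / 1e6;
}

// Assumption: the slowest selected language sets the rate for the whole job.
function estimateJob(width: number, height: number, langs: LangCode[]): number {
  if (langs.length === 0) {
    throw new Error("Pick at least one language");
  }
  const rate = langs.reduce((max, lang) => Math.max(max, RATE[lang]), 0);
  const seconds = megapixels(width, height) * rate + 2; // 2 s setup
  return Math.round(seconds * 10) / 10; // one decimal place, as displayed
}

console.log(estimateJob(1000, 400, ["eng"]));         // 3.6
console.log(estimateJob(2000, 3000, ["eng"]));        // 26
console.log(estimateJob(1500, 2000, ["sin", "eng"])); // 20
```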

Languages & page-segmentation modes

Three language packs are available on this page. Multi-language jobs are supported: Tesseract joins the codes with "+" and recognises in the order you picked.

Language | Code | First-load size | Notes
English | eng | 4 MB | Most accurate on screenshots, printed documents, and signage.
Sinhala · සිංහල | sin | 10 MB | Trained on Sinhala print. Cursive and handwriting accuracy is lower.
Tamil · தமிழ் | tam | 9 MB | Trained on Tamil print. Cursive and handwriting accuracy is lower.

PSM | Mode | Best for
3 | Auto (default) | Fully automatic page segmentation. Best for screenshots, scans, and most photos.
6 | Single block | Assume a uniform block of text. Best for a tightly cropped paragraph.
7 | Single line | Treat the image as one line. Best for a name tag, slogan, or single sentence.
11 | Sparse text | Find as much text as possible in no particular order. Best for menus, charts, signs.
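
As a rough illustration of how a language selection and a PSM choice combine into one job, the sketch below builds the "+"-joined language string in picked order and pairs it with a mode label. The names `buildLangString`, `describeJob`, and `PSM_LABELS` are hypothetical, and dropping duplicate codes is an assumption about sensible behaviour, not something the page documents.

```typescript
type LangCode = "eng" | "sin" | "tam";
type Psm = 3 | 6 | 7 | 11;

// Mode names for the four PSM values offered on the page.
const PSM_LABELS: Record<Psm, string> = {
  3: "Auto",
  6: "Single block",
  7: "Single line",
  11: "Sparse text",
};

// Join language codes with "+", keeping the order the user picked.
// Assumption: repeated codes are dropped, keeping the first occurrence.
function buildLangString(langs: LangCode[]): string {
  return langs.filter((lang, i) => langs.indexOf(lang) === i).join("+");
}

// One-line summary of a job, with PSM 3 (Auto) as the default.
function describeJob(langs: LangCode[], psm: Psm = 3): string {
  return buildLangString(langs) + " / PSM " + psm + " (" + PSM_LABELS[psm] + ")";
}

console.log(buildLangString(["sin", "eng"])); // sin+eng
console.log(describeJob(["eng"], 11));        // eng / PSM 11 (Sparse text)
```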

Confidence grading

For each image, Tesseract reports an overall confidence between 0 and 100, averaged across every recognised character. The tool grades it using these thresholds:

  • 95 · High: Likely usable as-is. Skim for any unusual characters.
  • 78 · Medium: Mostly correct. Expect a handful of edits per page.
  • 60 · Low: Substantial editing needed. Try a sharper image or a different PSM.
  • 35 · Very low: Re-shoot the image with better lighting, focus, and contrast.

Confidence is a heuristic. A High-confidence result can still mis-read look-alike characters (0/O, 1/l, rn/m). Always proofread.
