Question 1

Is there a free English to Sinhala translator that works offline?

Accepted Answer

Yes — this one. The whole translation runs in your browser via Meta's NLLB-200 (distilled-600M variant), an open-weights neural machine-translation model. The first time you click Translate, your browser downloads about 640 MB of model weights once and caches them; after that, every translation runs locally and works offline. No signup, no daily character limit, no telemetry. The page also works for English ↔ Tamil and 50+ other language pairs.

Question 2

How accurate is AI translation between Sinhala and English?

Accepted Answer

Surprisingly good for everyday text. On Meta's published FLORES-200 benchmark — a curated set of professionally translated sentences — NLLB-200 distilled-600M scores 44.7 chrF++ for English → Sinhala and 47.2 chrF++ for Sinhala → English (NLLB-200 paper, Table 14). Both are in the "Good" band: confident for general use, with light editing usually enough for publication. The model handles Sinhala-script proper nouns, mixed Latin/Sinhala input, and idiomatic phrasing better than the older statistical translators. Specialised domains (medical, legal, hyper-technical) still benefit from human review.

Question 3

What is the best free Tamil to English translator without signup?

Accepted Answer

For text-only translation, this page. It runs Tamil ↔ English on the same NLLB-200 backbone, which scores 51.8 chrF++ for Tamil → English on FLORES-200, a strong result for a free open-weights model. No Google account and no daily character limit; the only cap is 4,000 characters per paste. For document translation (PDF/Word), you'd combine an OCR tool with this; that's on our roadmap but not in v1.

Question 4

Can I translate documents to Sinhala without uploading them?

Accepted Answer

Paste-text translation, yes — everything happens locally in your browser, so nothing ever leaves your device. Direct PDF/Word document translation isn't in v1: paste the text out of the document and translate it here. We are planning a follow-up tool that combines our PDF text extractor with this translator so you can drop a PDF in and get a translated copy back, still entirely local.

Question 5

Why does Google Translate get Sinhala wrong sometimes?

Accepted Answer

Same root cause as for any neural translator: Sinhala is a relatively low-resource language compared to Spanish or French, so training data is sparser and rarer constructions are less well-modelled. Google Translate also tends to commit to a fluent, confident-sounding phrasing when it is unsure, even when that phrasing is wrong. NLLB-200 (the model behind this tool) was specifically optimised for low-resource pairs including Sinhala and Tamil, and its FLORES-200 chrF++ scores on these pairs are noticeably better than the public state of the art from a few years ago. That said, no model is perfect on Sinhala yet, which is why we surface the chrF++ band on every translation.

Question 6

How does the tool keep Sri Lankan institution names consistent?

Accepted Answer

A curated override dictionary of 53+ institutional and place names — universities, ministries, banks, major cities — is loaded before translation. When the toggle is on, the source-language form (e.g. "University of Jaffna" in English) is replaced with a sentinel token, the rest of the sentence is translated, and the sentinel is then restored using the target-language entry ("யாழ்ப்பாணப் பல்கலைக்கழகம்" in Tamil). Each dictionary entry cites its source — the institution's own website where available. You can switch the toggle off if you'd rather see the model's free translation.
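
The sentinel round trip described above can be sketched as follows. This is a minimal illustration, not the tool's actual code: the `OverrideEntry` shape, the `__ENT0__` sentinel format, and the `translate` callback standing in for the model are all assumptions made for the example.

```typescript
// Hypothetical override entry: a source-language name and its fixed target-language form.
interface OverrideEntry {
  source: string;
  target: string;
}

// Assumed sentinel format; the tool's real placeholder token is not documented here.
const sentinel = (i: number): string => `__ENT${i}__`;

function translateWithOverrides(
  text: string,
  overrides: OverrideEntry[],
  translate: (s: string) => string, // stand-in for the model call
): string {
  // Longest names first, so a full name wins over a shorter overlapping entry.
  const ordered = [...overrides].sort((a, b) => b.source.length - a.source.length);

  // 1. Swap each known name for a sentinel before translation.
  let masked = text;
  const targets: string[] = [];
  for (const entry of ordered) {
    if (masked.includes(entry.source)) {
      masked = masked.split(entry.source).join(sentinel(targets.length));
      targets.push(entry.target);
    }
  }

  // 2. Translate the masked text; the sentinels are expected to pass through unchanged.
  const translated = translate(masked);

  // 3. Restore each sentinel with the curated target-language name.
  return targets.reduce((out, target, i) => out.split(sentinel(i)).join(target), translated);
}
```

With the toggle off, the equivalent would be calling `translate(text)` directly. A real implementation would also need to handle the case where the model alters or drops a sentinel, which this sketch does not attempt.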

Question 7

How long does the model download take, and is it really one-time?

Accepted Answer

About 640 MB on the first run. On a typical Sri Lankan home connection (10–30 Mbps) that works out to roughly three to nine minutes, since 640 MB is about 5,120 megabits. The browser caches the weights through transformers.js' built-in browser cache (the Cache API); subsequent translations on this page (or on any other induwara.lk page that uses NLLB-200) reuse the cached copy instead of downloading it again. The cache persists across sessions until you clear browser data or storage runs low.
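
The download estimate is plain arithmetic on file size and link speed. A minimal sketch, using the 640 MB size and the 10–30 Mbps range quoted above; the function name is illustrative, and protocol overhead and server-side throttling are ignored:

```typescript
// Estimated download time in seconds for `sizeMB` megabytes
// over a link of `mbps` megabits per second.
function downloadSeconds(sizeMB: number, mbps: number): number {
  const megabits = sizeMB * 8; // 1 byte = 8 bits
  return megabits / mbps;
}

const MODEL_MB = 640; // approximate model size quoted above

for (const mbps of [10, 30]) {
  const minutes = downloadSeconds(MODEL_MB, mbps) / 60;
  console.log(`${mbps} Mbps: about ${minutes.toFixed(1)} minutes`);
}
// 10 Mbps: about 8.5 minutes
// 30 Mbps: about 2.8 minutes
```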

Question 8

Why does the tool split my paragraph into sentences before translating?

Accepted Answer

Sequence-to-sequence translators like NLLB-200 are most accurate on single sentences. Long paragraphs tend to lose words near the end as the decoder runs up against its token budget. By splitting into sentences first and translating each piece separately, the tool keeps every sentence inside the model's comfort zone and lines up source and translation segment by segment, so you can verify the result one sentence at a time. The original line breaks and punctuation are restored when the translated sentences are reassembled.
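
A rough sketch of the split, translate, and reassemble idea. The splitting rule here (a sentence ends at `.`, `!` or `?`) is a deliberate simplification and an assumption of this example; the tool's real splitter is not documented here, and the `translate` callback is a stand-in for the model.

```typescript
// Split text into sentence segments without losing any characters.
// Assumption: a sentence ends at ., ! or ?; this simple rule mis-splits
// abbreviations and decimals such as "3.5".
function splitSentences(text: string): string[] {
  return text.match(/[^.!?]*[.!?]+\s*|[^.!?]+$/g) || [];
}

// Translate each segment on its own, then put back its original trailing
// whitespace (spaces and line breaks) before re-joining.
function translateBySentence(text: string, translate: (s: string) => string): string {
  return splitSentences(text)
    .map((segment) => {
      const core = segment.replace(/\s+$/, "");
      const trailing = segment.slice(core.length);
      return core.length > 0 ? translate(core) + trailing : segment;
    })
    .join("");
}
```

Because every character of the input lands in exactly one segment, joining the untranslated segments reproduces the original text, which is what makes a side-by-side source and translation view possible.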

Question 9

Is the model output deterministic — same input, same output?

Accepted Answer

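Yes. All three modes (Fast / Standard / Quality) run beam search with sampling disabled, so the model's output is a pure function of (input, source language, target language, mode). Same paragraph, same mode, same pair → same translation. Fast and Standard differ in beam width; Quality also adjusts the length penalty. None of them roll dice. The one caveat is numerical: different backends (WebAssembly versus WebGPU) can round floating-point operations slightly differently, so a near-tie between two candidate phrasings may occasionally resolve differently from one machine to another.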
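
To see why beam search without sampling is repeatable, here is a toy version. It is not NLLB-200: the next-token table is invented for illustration, and `beamSearch`, `rank`, and `toyModel` are names made up for this sketch. The point is only that every step is a sort and a cut, with no random draw.

```typescript
const EOS = "</s>";

// A "model" here is any function returning [token, logProbability] pairs for a prefix.
type Step = (prefix: string[]) => Array<[string, number]>;

interface Hypothesis {
  tokens: string[];
  score: number; // sum of log-probabilities so far
}

const isFinished = (h: Hypothesis): boolean => h.tokens[h.tokens.length - 1] === EOS;

// Total order: higher score first, ties broken by comparing the token text.
function rank(a: Hypothesis, b: Hypothesis): number {
  if (a.score !== b.score) return b.score - a.score;
  const ta = a.tokens.join(" ");
  const tb = b.tokens.join(" ");
  return ta < tb ? -1 : ta > tb ? 1 : 0;
}

function beamSearch(step: Step, beamWidth: number, maxLen: number): string[] {
  let beams: Hypothesis[] = [{ tokens: [], score: 0 }];
  for (let t = 0; t < maxLen; t++) {
    const candidates: Hypothesis[] = [];
    for (const beam of beams) {
      if (isFinished(beam)) {
        candidates.push(beam); // finished hypotheses carry over unchanged
        continue;
      }
      for (const [token, logProb] of step(beam.tokens)) {
        candidates.push({ tokens: [...beam.tokens, token], score: beam.score + logProb });
      }
    }
    beams = candidates.sort(rank).slice(0, beamWidth); // sort and cut, nothing random
    if (beams.every(isFinished)) break;
  }
  const best = beams[0];
  return best ? best.tokens : [];
}

// Invented next-token table, for illustration only.
const table: Record<string, Array<[string, number]>> = {
  "": [["the", Math.log(0.6)], ["a", Math.log(0.4)]],
  "the": [["cat", Math.log(0.5)], ["dog", Math.log(0.5)]],
  "a": [["cat", Math.log(0.9)], ["dog", Math.log(0.1)]],
};
const toyModel: Step = (prefix) => table[prefix.join(" ")] || [[EOS, 0]];

console.log(beamSearch(toyModel, 1, 5).join(" ")); // the cat </s>
console.log(beamSearch(toyModel, 2, 5).join(" ")); // a cat </s>
```

The two calls give different answers because the beam width differs, which mirrors how modes can differ from each other while each mode on its own always returns the same result for the same input.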

Question 10

Can I use this for commercial work — translating my company's website?

Accepted Answer

No, not under the model's current licence. NLLB-200's weights are released under Creative Commons BY-NC 4.0, a non-commercial licence. induwara.lk is non-commercial in its first 90 days (per our public mission), and the page surfaces the licence so you know what you're using. For commercial deployment you'd need to host a permissively licensed alternative (M2M-100, MarianMT pairs) or pay for a translation API.

Question 11

What is the longest text I can translate in one go?

Accepted Answer

4,000 characters per translation. For longer text, split it into paragraphs and translate each one separately; the quality is the same, and it is easier on your laptop's fan. The cap also bounds how much work a single translation can ask of the browser tab: a 4,000-character input in Standard mode usually takes 5–25 seconds on an average laptop.
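
If you are preparing a long text yourself, grouping paragraphs under the cap could look like the sketch below. The function name and the blank-line paragraph rule are assumptions of this example, and length is counted with JavaScript's `string.length` (UTF-16 code units), which may differ slightly from how the page counts characters.

```typescript
const MAX_CHARS = 4000; // per-translation cap quoted above

// Group paragraphs (separated by blank lines) into chunks of at most `limit` characters.
// Assumption: a single paragraph longer than the limit is returned as its own
// oversized chunk and would need further splitting, for example by sentence.
function chunkByParagraph(text: string, limit: number = MAX_CHARS): string[] {
  const paragraphs = text.split(/\n{2,}/).filter((p) => p.trim().length > 0);
  const chunks: string[] = [];
  let current = "";
  for (const paragraph of paragraphs) {
    const joined = current ? current + "\n\n" + paragraph : paragraph;
    if (joined.length <= limit) {
      current = joined; // still fits, keep growing this chunk
    } else {
      if (current) chunks.push(current);
      current = paragraph; // start a new chunk with this paragraph
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Each returned chunk can then be pasted and translated on its own, in order.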

Question 12

Which browsers and devices does this work on? I'm on a phone.

Accepted Answer

Any modern desktop or laptop browser with WebAssembly SIMD — Chrome, Edge, Firefox, Safari all qualify. WebGPU is detected automatically and used when available (newer Chrome and Edge), which makes inference 2–4× faster. On mobile: technically supported, but the ~640 MB model download and the per-translation compute load make it impractical on phones with less than ~3 GB of RAM free. If your phone struggles, try Fast mode and short inputs.
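
The automatic backend choice can be pictured roughly as below. This is a simplified sketch, not the tool's code: `NavigatorLike` and `pickBackend` are hypothetical names, and the check only looks for the presence of `navigator.gpu`. A fuller check would also await `navigator.gpu.requestAdapter()` and fall back to WebAssembly when it resolves to `null`.

```typescript
// Minimal shape of the part of `navigator` this sketch inspects.
interface NavigatorLike {
  gpu?: unknown; // present in browsers that expose WebGPU
}

type Backend = "webgpu" | "wasm";

// Prefer WebGPU when the browser exposes it, otherwise fall back to WebAssembly.
function pickBackend(nav: NavigatorLike | undefined): Backend {
  return nav !== undefined && nav.gpu !== undefined ? "webgpu" : "wasm";
}
```

In a browser the argument would be the global `navigator` object; passing `undefined` covers environments where it does not exist.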

Question 13

When were the model card, scores, and language list last verified?

Accepted Answer

2026-05-12. The NLLB-200 paper, model card on Hugging Face, FLORES-200 score table, transformers.js library version, and the Sri Lankan institution override dictionary were all cross-checked on this date. Numbers and licence terms are reviewed quarterly; the override dictionary is updated whenever an institution publishes a name change.

English to Sinhala Translator — Tamil & 63+ Languages, In Your Browser

How it works

1. Pre-processing

2. Source-language detection

3. Entity preservation

4. Translation (beam search)

5. Post-processing and quality band

Worked examples

Frequently asked questions

Sources & references
