Sinhala Unicode Converter — Legacy Font to Unicode
Paste Sinhala text saved in an old legacy font (the FM-Abhaya family) and get clean, copy-paste-ready Sinhala Unicode that works on Facebook, WhatsApp, government eServices, and MS Word — or convert the other way for legacy-font print layouts. Runs entirely in your browser; nothing is uploaded.
How it works
For roughly two decades before Unicode adoption, almost all digital Sinhala was typed in font-encoded “legacy” fonts: the FM and DL families, Kaputa, Bhashitha and the rest. In those fonts each Sinhala glyph occupies a Latin or ASCII code position, and the text is stored in visual order (the order in which glyphs appear left-to-right on screen). On any device without that exact font installed, the same bytes render as Latin gibberish such as Y%S ,xld instead of ශ්‍රී ලංකා.
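To make the "Latin gibberish" point concrete, the short sketch below counts which code-point ranges a string uses. The function name script_profile is hypothetical and is not part of this tool; the only outside fact it relies on is that the Sinhala Unicode block spans U+0D80 to U+0DFF.

```python
# Minimal sketch: legacy-font text is really just ASCII, which is why it
# renders as Latin characters when the legacy font is missing.
# script_profile is a hypothetical helper, not part of the converter.

SINHALA_START, SINHALA_END = 0x0D80, 0x0DFF  # Sinhala Unicode block


def script_profile(text: str) -> dict:
    """Count non-space characters by code-point range."""
    counts = {"ascii": 0, "sinhala": 0, "other": 0}
    for ch in text:
        if ch.isspace():
            continue  # whitespace is the same in both encodings
        cp = ord(ch)
        if cp < 0x80:
            counts["ascii"] += 1
        elif SINHALA_START <= cp <= SINHALA_END:
            counts["sinhala"] += 1
        else:
            counts["other"] += 1
    return counts


legacy_sample = "Y%S ,xld"                    # legacy-font bytes from the text above
unicode_sample = "\u0dbd\u0d82\u0d9a\u0dcf"   # ලංකා in Sinhala Unicode

print(script_profile(legacy_sample))
print(script_profile(unicode_sample))
```

The legacy sample contains only ASCII code points, so without the matching font a device has no way to know it was meant as Sinhala.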
Sinhala Unicode (the Unicode block U+0D80–U+0DFF) instead stores text in logical order, that is, pronunciation or typing order, and lets the system shape it at render time. Because there is one standard code point per character, Unicode text displays consistently on any modern device using its built-in Sinhala font, with no special font to install. Conversion between the two is a deterministic, two-stage transform, not a guess:
- Glyph substitution. Each legacy code unit (and each known multi-glyph ligature, longest match first) is replaced with its target Unicode code point using the per-font mapping table. Longest-match-first scanning means a two-key sequence such as wd → ආ is matched before the single w → අ.
- Visual-to-logical reordering. Several dependent vowel signs (the kombuva family ෙ ේ ෛ) are drawn before their consonant in legacy fonts but must follow it in Unicode. So legacy visual ෙ + ක becomes logical ක + ෙ = කෙ.
The reverse direction inverts both stages.
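The two stages can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tool's actual implementation: the table is a tiny hypothetical excerpt, the function names are invented for the example, and the entry f → ෙ is an assumption about the FM-Abhaya layout (the other entries follow from the examples given on this page).

```python
# Minimal sketch of the two-stage Legacy -> Unicode transform.
# LEGACY_TO_UNICODE is a hypothetical excerpt, not the verified table.
LEGACY_TO_UNICODE = {
    "wd": "\u0d86",  # ආ (two-key sequence, must win over "w")
    "w": "\u0d85",   # අ
    "l": "\u0d9a",   # ක
    ",": "\u0dbd",   # ල
    "x": "\u0d82",   # ං
    "d": "\u0dcf",   # ා
    "f": "\u0dd9",   # ෙ kombuva (assumed key for illustration)
}

# Vowel signs drawn before the consonant in legacy fonts: ෙ ේ ෛ
PREBASE_SIGNS = {"\u0dd9", "\u0dda", "\u0ddb"}


def is_consonant(ch: str) -> bool:
    """Sinhala consonants occupy U+0D9A..U+0DC6."""
    return "\u0d9a" <= ch <= "\u0dc6"


def substitute(text: str) -> str:
    """Stage 1: longest-match-first glyph substitution."""
    longest = max(len(key) for key in LEGACY_TO_UNICODE)
    out, i = [], 0
    while i < len(text):
        for n in range(longest, 0, -1):
            chunk = text[i:i + n]
            if chunk in LEGACY_TO_UNICODE:
                out.append(LEGACY_TO_UNICODE[chunk])
                i += len(chunk)
                break
        else:
            out.append(text[i])  # unmapped: pass through unchanged
            i += 1
    return "".join(out)


def reorder(text: str) -> str:
    """Stage 2: move a pre-base vowel sign after the consonant that follows it.

    Simplification: handles a single consonant only, not conjunct clusters.
    """
    out, i = [], 0
    while i < len(text):
        if text[i] in PREBASE_SIGNS and i + 1 < len(text) and is_consonant(text[i + 1]):
            out.extend([text[i + 1], text[i]])
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def legacy_to_unicode(text: str) -> str:
    return reorder(substitute(text))


print(legacy_to_unicode("fl"))    # visual ෙ + ක, reordered to logical ක + ෙ
print(legacy_to_unicode("wd"))    # matched as one two-key sequence
print(legacy_to_unicode(",xld"))
```

A fuller converter would also need to handle conjunct clusters and compose split vowel signs (for example, ෙ followed by ා around one consonant should become the single sign ො); those cases are left out here to keep the two stages visible.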
The code-point mappings and ordering rules come from the Unicode Standard Sinhala chart, the SLS 1134 national standard for Sinhala character coding, and the University of Colombo School of Computing (UCSC) Language Technology Research Laboratory legacy-font research. Every mapping in this build is checked to round-trip losslessly: converting Legacy → Unicode → Legacy returns the original text, which is how the tool earns its “round-trip verified” badge. Glyphs outside the verified FM-Abhaya core set are never guessed. They pass through unchanged and are counted for you as unmapped, so a partial conversion is always visible rather than silent.
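The round-trip check and the unmapped counter described above might look roughly like the sketch below. It is deliberately simplified: it covers the substitution stage only (no reordering), the table is a small hypothetical excerpt, and the names convert and round_trips are invented for this example. Treating whitespace as "not counted" is also an assumption.

```python
# Minimal sketch: pass-through of unmapped glyphs, an unmapped counter,
# and a Legacy -> Unicode -> Legacy round-trip check.
# FORWARD is a hypothetical excerpt, not the tool's verified table.
from typing import Dict, Tuple

FORWARD: Dict[str, str] = {
    "wd": "\u0d86",  # ආ
    "w": "\u0d85",   # අ
    "l": "\u0d9a",   # ක
    ",": "\u0dbd",   # ල
    "x": "\u0d82",   # ං
    "d": "\u0dcf",   # ා
}
# Inverting the table is only safe because every target value is unique.
REVERSE: Dict[str, str] = {uni: legacy for legacy, uni in FORWARD.items()}


def convert(text: str, table: Dict[str, str]) -> Tuple[str, int]:
    """Longest-match substitution; returns (converted text, unmapped count)."""
    longest = max(len(key) for key in table)
    out, unmapped, i = [], 0, 0
    while i < len(text):
        for n in range(longest, 0, -1):
            chunk = text[i:i + n]
            if chunk in table:
                out.append(table[chunk])
                i += len(chunk)
                break
        else:
            # Never guess: keep the character and make the gap visible.
            if not text[i].isspace():
                unmapped += 1
            out.append(text[i])
            i += 1
    return "".join(out), unmapped


def round_trips(legacy: str) -> bool:
    """True if Legacy -> Unicode -> Legacy returns the original text."""
    unicode_text, _ = convert(legacy, FORWARD)
    back, _ = convert(unicode_text, REVERSE)
    return back == legacy


converted, missing = convert("Y%S ,xld", FORWARD)
print(converted, missing)        # Y, % and S are outside this excerpt
print(round_trips(",xld wd"))
```

Because unmapped characters are copied through rather than dropped, the round trip still holds for partially covered text, and the counter tells the user how much of the input was left unconverted.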
Sources & references
- The Unicode Standard — Sinhala block U+0D80–U+0DFF (code chart)
- SLS 1134 — Sri Lanka Standard for Sinhala Character Code (SLSI)
- UCSC Language Technology Research Laboratory — legacy-font research
- ICTA — Sinhala Unicode / locale resources
The mapping tables on this page were last cross-checked against the Unicode Sinhala chart and the cited references on 2026-06-06. Coverage of additional FM, DL, Kaputa, and Bhashitha glyphs expands as each is verified against the UCSC LTRL reference. A mapping is added only once it is confirmed, because a wrong mapping produces silently incorrect text.
Comments & feedback
Found a glyph that didn't convert, or want another legacy font added?
Email me at [email protected] — most fixes ship within 24 hours.