induwara.lk

Speech to Text — dictate any language in your browser

Press Start, speak, and watch your words appear live. The tool uses the Web Speech API to drive the recognition engine built into Chrome, Edge, or Safari: no signup, no upload to this site, and support for English, Sinhala, Tamil, and 30+ more languages. Export the transcript as plain text, SubRip (.srt), or WebVTT (.vtt) subtitles.

By Induwara Ashinsana · Updated May 11, 2026

Sources: language list and event handling follow the WICG Web Speech API draft; word/sentence counts use Unicode property classes (\p{L}\p{M}\p{N}) so Sinhala and Tamil syllables stay together. Full citations in the Sources section below.

How it works

The tool is a thin React layer over the SpeechRecognition interface from the WICG Web Speech API draft. There is no transcription model shipped with this page — when you press Start, a recognition object is created (via window.SpeechRecognition or window.webkitSpeechRecognition), configured with the language tag and continuous / interim flags, and the browser starts streaming microphone audio to whichever recognition backend it ships with. Chrome and Edge use cloud engines (Google and Microsoft respectively); Safari since iOS 14.5 and macOS Sonoma runs recognition on-device for installed language packs.
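The setup step described above might look roughly like the sketch below. The property names (lang, continuous, interimResults) and the two constructor names come from the Web Speech API; the interfaces RecognizerLike, RecognizerHost, RecognizerOptions and the function createRecognizer are hypothetical names chosen for illustration, not the page's actual code.

```typescript
// Minimal shape of the properties this sketch touches; the real
// SpeechRecognition interface exposes many more members.
interface RecognizerLike {
  lang: string;
  continuous: boolean;
  interimResults: boolean;
}

// Hypothetical stand-in for `window`, so the lookup can also be exercised
// outside a browser.
interface RecognizerHost {
  SpeechRecognition?: new () => RecognizerLike;
  webkitSpeechRecognition?: new () => RecognizerLike;
}

interface RecognizerOptions {
  lang: string;        // BCP 47 language tag, e.g. "si-LK"
  continuous: boolean; // keep listening until Stop is pressed
  interim: boolean;    // emit partial results before they are finalized
}

// Returns a configured recognizer, or null when the host has no engine.
function createRecognizer(
  host: RecognizerHost,
  opts: RecognizerOptions
): RecognizerLike | null {
  const Ctor = host.SpeechRecognition ?? host.webkitSpeechRecognition;
  if (!Ctor) return null;
  const rec = new Ctor();
  rec.lang = opts.lang;
  rec.continuous = opts.continuous;
  rec.interimResults = opts.interim;
  return rec;
}
```

In a browser, the host argument would be window (cast to RecognizerHost), and the returned object would be started from inside the Start button's click handler. A null return is the signal to show an "unsupported browser" notice instead.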

Four pieces sit between the microphone and the transcript area:

  1. Permission and start. Calling recognition.start() triggers the browser's microphone permission prompt the first time you visit the page. The user gesture (the Start button click) is required by all engines — programmatic auto-start would be rejected with a not-allowed error.
  2. Result handling. The engine fires result events with a SpeechRecognitionResultList. Each entry has isFinal true (committed) or false (interim). The tool walks results from resultIndex forward — already-finalized entries don't need to be re-processed — and appends every final chunk to the segments array with a timestamp measured against the Start moment.
  3. Stat counting. Word and sentence counts use Unicode property classes (\p{L}\p{M}\p{N}), so Sinhala syllable clusters like ආයුබෝවන් and Tamil clusters like வணக்கம் count as one word each. The combining marks (\p{M}) keep the cluster together; without them, every vowel sign or virama would act as a word boundary and split a single word into several fragments. WPM is straight division: words ÷ (durationMs ÷ 1000) × 60.
  4. Subtitle export. SubRip (HH:MM:SS,mmm) and WebVTT (HH:MM:SS.mmm) timestamps are built from the per-segment millisecond offsets. The two timestamp formats differ by exactly one character, the decimal separator, and the "Verified · methodology checked" badge confirms that SRT(t).replace(",", ".") === VTT(t) for every sampled timestamp on every page load.
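
The result-handling step (item 2 above) can be sketched as follows. The event fields resultIndex, results, isFinal and transcript are part of the Web Speech API; the structural interfaces, the Segment shape and the handleResult function are hypothetical. In particular, starting each segment at the previous segment's end time is an assumption made for this sketch, since the page only says timestamps are measured against the Start moment.

```typescript
// Structural types mirroring the parts of SpeechRecognitionEvent used here.
interface AlternativeLike { transcript: string }
interface ResultLike {
  isFinal: boolean;
  length: number;
  [index: number]: AlternativeLike;
}
interface ResultEventLike {
  resultIndex: number;
  results: { length: number; [index: number]: ResultLike };
}

interface Segment { text: string; startMs: number; endMs: number }

// Appends final chunks to `segments` and returns the interim text to render.
// `nowMs` is the elapsed time since Start when the event arrived.
function handleResult(
  event: ResultEventLike,
  segments: Segment[],
  nowMs: number
): string {
  let interim = "";
  // Entries before resultIndex were already processed by earlier events.
  for (let i = event.resultIndex; i < event.results.length; i++) {
    const result = event.results[i];
    const text = result[0].transcript.trim();
    if (result.isFinal) {
      if (text === "") continue;
      // Assumption: a segment starts where the previous one ended.
      const prevEnd =
        segments.length > 0 ? segments[segments.length - 1].endMs : 0;
      segments.push({ text, startMs: prevEnd, endMs: nowMs });
    } else {
      interim += (interim ? " " : "") + text;
    }
  }
  return interim;
}
```

The returned interim string would be what the tool shows in grey; the segments array is what later feeds the stats and the subtitle export.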

Privacy lives at the boundary between the browser and the recognition backend. The page never touches a server itself — every state transition (Start, Stop, Clear, Copy, Download) is local JavaScript. But the audio leaving the browser, when Chrome or Edge is the host, travels to Google or Microsoft for transcription. That trade-off is fundamental to the way Chrome implements the API, and there is no page-level switch to flip. Safari with an installed language pack is the only fully on-device path; the tool labels the engine in the startup notice so you can choose accordingly.

The transport row shows a pulsing dot while the engine is listening and a monotonic elapsed timer keyed off performance.now() (immune to system clock changes). Interim results render in italic muted grey; final segments switch to body weight. When you press Stop, the engine fires one last result batch with isFinal=true for whatever it had buffered, then end fires and the tool tears down the listeners.
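
A monotonic elapsed timer of the kind described here could be sketched like this. The names createElapsedTimer and formatElapsed are hypothetical, and the clock is passed in as a function so the same code works with performance.now() in a browser and with a fake clock elsewhere.

```typescript
// `now` must return a monotonic millisecond reading,
// e.g. () => performance.now() in a browser.
function createElapsedTimer(now: () => number) {
  const startedAt = now();
  return {
    // Milliseconds elapsed since the timer was created.
    elapsedMs: (): number => now() - startedAt,
  };
}

// Formats milliseconds as m:ss for display.
function formatElapsed(ms: number): string {
  const totalSeconds = Math.floor(ms / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return minutes + ":" + (seconds < 10 ? "0" : "") + seconds;
}
```

Because the reading is a difference between two monotonic samples, a change to the system clock while dictating does not affect the displayed time.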

Worked examples

Short English sentence

Spoken: "Hello world, this is a test."

  1. Click Start → mic permission prompt (first session only)
  2. Engine streams → onresult fires with interim results
  3. Final: 'Hello world, this is a test.' (one segment)
  4. countWords → 6 (regex [\p{L}\p{N}\p{M}]+ matches Hello/world/this/is/a/test)
  5. countSentences → 1 (one terminator)
  6. WPM at 4 s → 6 / 4 × 60 = 90 wpm
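
The counting steps in this example can be reproduced with a short sketch. The function names countWords and countSentences appear in the steps above, but the bodies here are assumptions: the sentence rule in particular (split on runs of ., ! or ?) is a simplification chosen for illustration, and wordsPerMinute is a hypothetical name for the WPM division.

```typescript
// Built with the RegExp constructor; the "u" flag enables \p{...} escapes.
const WORD_RE = new RegExp("[\\p{L}\\p{N}\\p{M}]+", "gu");

// A word is a maximal run of letters, numbers and combining marks.
function countWords(text: string): number {
  const matches = text.match(WORD_RE);
  return matches ? matches.length : 0;
}

// Assumption: sentences are separated by runs of ., ! or ?;
// pieces that contain no words are ignored.
function countSentences(text: string): number {
  return text.split(/[.!?]+/).filter((piece) => countWords(piece) > 0).length;
}

// words / (durationMs / 1000) * 60, guarded against a zero duration.
function wordsPerMinute(words: number, durationMs: number): number {
  if (durationMs <= 0) return 0;
  return (words / (durationMs / 1000)) * 60;
}

const spoken = "Hello world, this is a test.";
console.log(countWords(spoken));      // 6
console.log(countSentences(spoken));  // 1
console.log(wordsPerMinute(6, 4000)); // 90
```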

Sinhala greeting

Spoken (si-LK): "ආයුබෝවන් ලෝකය"

  1. Set Language → si-LK
  2. Engine: Chrome streams to Google's si-LK model; Safari uses on-device pack
  3. Final segment: 'ආයුබෝවන් ලෝකය'
  4. countWords → 2 (combining marks keep each syllable cluster intact)
  5. Without \p{M} in the regex → the words split at every vowel sign and virama, so the count comes out higher than 2 (wrong)
  6. Cleaned: identical (no adjacent duplicates)
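
The effect of \p{M} in steps 4 and 5 can be checked with a small comparison. countRuns is a hypothetical helper that builds the character class from whichever property escapes are passed in; it is a sketch for experimenting, not the page's own counter.

```typescript
// Counts maximal runs of characters drawn from the given Unicode
// property classes, e.g. "\\p{L}\\p{N}\\p{M}".
function countRuns(text: string, classes: string): number {
  const re = new RegExp("[" + classes + "]+", "gu");
  const matches = text.match(re);
  return matches ? matches.length : 0;
}

const greeting = "ආයුබෝවන් ලෝකය";

// With combining marks included, each word is one unbroken run.
const withMarks = countRuns(greeting, "\\p{L}\\p{N}\\p{M}");

// Without them, vowel signs and the virama act as separators.
const withoutMarks = countRuns(greeting, "\\p{L}\\p{N}");

console.log(withMarks);                // 2
console.log(withoutMarks > withMarks); // true
```

The same check applies to Tamil: the pulli in வணக்கம் is a combining mark, so the word only stays in one piece when \p{M} is part of the class.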

Long dictation with engine repeats

Spoken: "I want to to go home" (repeated "to" — common ASR artifact)

  1. Final transcript: 'I want to to go home'
  2. countWords → 6
  3. cleanTranscript → 'I want to go home' (adjacent duplicate collapsed, case-insensitive)
  4. Copy uses the cleaned form so paste is publication-ready
  5. Original segments still drive the SRT / VTT exports — timestamps stay accurate
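
A minimal sketch of the duplicate-collapsing step, assuming tokens are split on whitespace and compared case-insensitively. The name cleanTranscript comes from the steps above; the body is an illustration and may differ from what the page runs.

```typescript
// Collapses runs of the same whitespace-separated token, compared
// case-insensitively, keeping the first occurrence of each run.
function cleanTranscript(text: string): string {
  const tokens = text.split(/\s+/).filter((t) => t.length > 0);
  const kept: string[] = [];
  for (const token of tokens) {
    const prev = kept.length > 0 ? kept[kept.length - 1] : undefined;
    if (prev !== undefined && prev.toLowerCase() === token.toLowerCase()) {
      continue; // adjacent duplicate: drop it
    }
    kept.push(token);
  }
  return kept.join(" ");
}

console.log(cleanTranscript("I want to to go home")); // "I want to go home"
```

A rule this simple also collapses intentional repeats such as "had had", and it does not catch duplicates separated by punctuation ("to, to"), which is one reason to keep the original segments around for the subtitle exports.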

Subtitle export at 1 m 2 s

One segment: { text: 'Welcome to Colombo.', startMs: 62000, endMs: 64500 }

  1. formatTimestampSRT(62000) → '00:01:02,000'
  2. formatTimestampSRT(64500) → '00:01:04,500'
  3. buildSRT → '1\n00:01:02,000 --> 00:01:04,500\nWelcome to Colombo.\n'
  4. buildVTT → 'WEBVTT\n\n00:01:02.000 --> 00:01:04.500\nWelcome to Colombo.\n'
  5. VTT timestamps = SRT timestamps with ',' replaced by '.' (verified by the badge cross-check); the VTT file also adds the WEBVTT header and drops the cue number
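
The export steps above can be reproduced with the sketch below. The function names formatTimestampSRT, buildSRT and buildVTT and the segment shape are taken from this example; formatTimestampVTT, formatTimestamp and pad are hypothetical helpers, and the bodies are one plausible way to get the listed outputs rather than the page's actual source.

```typescript
interface Segment { text: string; startMs: number; endMs: number }

// Left-pads a non-negative integer with zeros to the given width.
function pad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) s = "0" + s;
  return s;
}

// HH:MM:SS<sep>mmm, where <sep> is "," for SubRip and "." for WebVTT.
function formatTimestamp(ms: number, separator: "," | "."): string {
  const total = Math.max(0, Math.round(ms));
  const hours = Math.floor(total / 3600000);
  const minutes = Math.floor((total % 3600000) / 60000);
  const seconds = Math.floor((total % 60000) / 1000);
  const millis = total % 1000;
  return (
    pad(hours, 2) + ":" + pad(minutes, 2) + ":" + pad(seconds, 2) +
    separator + pad(millis, 3)
  );
}

const formatTimestampSRT = (ms: number): string => formatTimestamp(ms, ",");
const formatTimestampVTT = (ms: number): string => formatTimestamp(ms, ".");

// Numbered cues separated by a blank line.
function buildSRT(segments: Segment[]): string {
  return segments
    .map((s, i) =>
      `${i + 1}\n${formatTimestampSRT(s.startMs)} --> ${formatTimestampSRT(s.endMs)}\n${s.text}\n`)
    .join("\n");
}

// WEBVTT header, blank line, then unnumbered cues.
function buildVTT(segments: Segment[]): string {
  const cues = segments
    .map((s) =>
      `${formatTimestampVTT(s.startMs)} --> ${formatTimestampVTT(s.endMs)}\n${s.text}\n`)
    .join("\n");
  return "WEBVTT\n\n" + cues;
}

const segment = { text: "Welcome to Colombo.", startMs: 62000, endMs: 64500 };
console.log(formatTimestampSRT(segment.startMs)); // 00:01:02,000
console.log(formatTimestampVTT(segment.endMs));   // 00:01:04.500
```

Routing both formats through one formatter with a separator parameter makes the badge's cross-check hold by construction for every input.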


Sources & references

The language list, error mapping, and subtitle timestamp helpers on this page were last cross-checked against the upstream specs on 2026-05-11. The page is reviewed whenever a Chromium recognition regression lands or the WICG Speech API draft ships a new revision. If you spot engine behaviour that disagrees with the methodology above, email me below.


Hit an engine error, a missing language, or want a different export format?

Email me at [email protected] — most fixes ship within 24 hours.