URL Encoder & Decoder
Percent-encode text for safe inclusion in URLs, or decode an encoded URL back to plain text. Supports component, whole-URI, and form-urlencoded profiles. UTF-8 round-trip verified, runs entirely in your browser.
How it works
A URL has a fixed structure (scheme, host, path, query, fragment) and a fixed list of characters with structural meaning. Any character outside that list — including spaces, most punctuation, and every non-ASCII byte — must be escaped before it can ride inside a URL. The escape format is the same one defined by RFC 3986 §2.1: a literal % followed by two hex digits naming the byte's value. The three profiles on this page differ only in which characters they escape:
- UTF-8 bytes from text. Any non-ASCII input is first converted to bytes by the standard
TextEncoder. That is why the Sinhala letter ශ (U+0DC1) becomes three bytesE0 B7 81in the encoded output rather than one — UTF-8 expresses code points above U+007F using two, three, or four bytes, and each byte is escaped independently. - The unreserved set. RFC 3986 §2.3 names 66 characters that always survive untouched: A-Z, a-z, 0-9, and the four marks
- . _ ~. Every encoder on this page leaves them alone. - The reserved set (gen-delims and sub-delims).
: / ? # [ ] @split a URI into its parts.! $ & ' ( ) * + , ; =separate fields inside one part. The Component profile escapes every reserved character because the value is going inside one segment. The Whole URI profile preserves them because the input is already structured. The Form profile is the Component profile with one extra step: a literal space is emitted as+instead of%20, matching the WHATWG application/x-www-form-urlencoded serializer used by HTML forms andURLSearchParams. - Decoder = the inverse, with sanity checks. The decoder scans for malformed escapes first — every
%must be followed by two hex digits, or a clear error fires with the offending position. If the syntax is sound, the bytes are gathered and run through a fatal-mode UTF-8 decoder. Form mode additionally replaces every literal+with a space before the percent-decoding step.
The implementation is built on the platform: encoding uses encodeURIComponentand encodeURI as defined in ECMA-262; URL parsing uses the WHATWG URL constructor; UTF-8 conversion uses the WHATWG Encoding TextEncoder. The page wraps these with deterministic error messages and a round-trip verifier that runs once per load over a probe containing ASCII reserved characters, Sinhala, and emoji. The green “Verified · round-trip” badge in the tool header confirms all three profiles passed.
Worked examples
Frequently asked questions
Sources & references
- IETF RFC 3986 — Uniform Resource Identifier (URI): Generic Syntax
- WHATWG URL Living Standard — URL parser and form-urlencoded serializer
- ECMA-262 — encodeURIComponent / encodeURI / decodeURIComponent specification
- WHATWG Encoding Standard — TextEncoder / TextDecoder (UTF-8)
- MDN Web Docs — Percent-encoding glossary entry
The encoder, decoder, and parser on this page were last cross-checked against ECMA-262 reference behaviour, the WHATWG URL parser, and Python's urllib.parse on 2026-05-11. Any algorithmic change bumps that date. If you spot a mismatch with another implementation, email me below.
Related tools
Comments & feedback
Spotted a bug or want an improvement? Tell us — our team reviews every comment, and good ideas get built. Comments are public and anonymous.
Spotted a decoding edge case, a profile mismatch, or want to suggest a new feature?
Email me at [email protected] — most fixes ship within 24 hours.