
What Is %20 in a URL? URL Encoding Explained — With the Traps Developers Fall Into

Original Author: bhnw. Released on 2026-04-07 15:40.

A Real Bug to Start With

A backend colleague once reported that a parameter value received from the frontend was being truncated. The frontend had passed a string containing an & character as a query parameter — but since it was concatenated directly into the URL without encoding, the backend naturally treated it as a parameter delimiter and split the value there.

This is exactly why URL encoding exists. Certain characters carry special meaning in URLs. When data contains those characters, they must be encoded first to be transmitted safely.
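The bug can be reproduced in a few lines. This is only a sketch: the value "rock & roll" and the example.com URL are made up for illustration, and the parsing is done with the standard URL class rather than whatever the backend in the story actually used.

```javascript
// Hypothetical parameter value that contains "&" as ordinary data.
const keywordWithAmp = "rock & roll";

// Naive concatenation: the "&" inside the value is read as a parameter separator.
const brokenUrl = "https://example.com/search?q=" + keywordWithAmp;
const brokenValue = new URL(brokenUrl).searchParams.get("q");
// brokenValue is "rock " (everything after the "&" was split off)

// Encoding the value first keeps it in one piece.
const safeUrl = "https://example.com/search?q=" + encodeURIComponent(keywordWithAmp);
const safeValue = new URL(safeUrl).searchParams.get("q");
// safeValue is "rock & roll"
```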


The Mechanics: Percent-Encoding

The formal name for URL encoding is percent-encoding. The rule is straightforward:

Convert the character to its UTF-8 bytes, then represent each byte as % followed by two hexadecimal digits.

Take a space character:

space → UTF-8 byte: 0x20 → URL encoded: %20

Take the Chinese character "中":

中 → UTF-8 bytes: 0xE4 0xB8 0xAD → URL encoded: %E4%B8%AD

So %20 in a URL is a space, and %E4%B8%AD%E6%96%87 is the Chinese word for "Chinese."
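The rule above can be sketched by hand. The helper name percentEncodeAllBytes is hypothetical, not a built-in, and unlike the real encoding functions it encodes every byte, including letters and digits that would normally be left alone. It is meant only to show the bytes-to-%XX step.

```javascript
// Illustrative helper (not a built-in): percent-encode every UTF-8 byte of a string.
function percentEncodeAllBytes(text) {
  const bytes = new TextEncoder().encode(text); // TextEncoder always produces UTF-8
  return Array.from(bytes)
    .map((byte) => "%" + byte.toString(16).toUpperCase().padStart(2, "0"))
    .join("");
}

percentEncodeAllBytes(" ");    // "%20"
percentEncodeAllBytes("中");   // "%E4%B8%AD"
percentEncodeAllBytes("中文"); // "%E4%B8%AD%E6%96%87"
```

For characters that actually require encoding, the output matches what encodeURIComponent produces.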

Which Characters Need Encoding

RFC 3986 defines two character classes, unreserved and reserved; everything outside them forms a third group:

  • Unreserved characters (never encoded): A-Z a-z 0-9 - _ . ~
  • Reserved characters (special meaning in URLs; must be encoded when used as data): : / ? # [ ] @ ! $ & ' ( ) * + , ; =
  • Everything else (Chinese, Japanese, spaces, symbols): must always be encoded
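As a rough illustration of the three groups, a small classifier might look like this. The function name classifyUrlChar and the returned labels are assumptions made for this sketch; it expects a single character as input.

```javascript
// Character sets copied from the list above (RFC 3986 unreserved and reserved).
const UNRESERVED_PATTERN = /^[A-Za-z0-9\-._~]$/;
const RESERVED_CHARS = ":/?#[]@!$&'()*+,;=";

// Hypothetical helper: report which group a single character belongs to.
function classifyUrlChar(char) {
  if (UNRESERVED_PATTERN.test(char)) return "unreserved";
  if (RESERVED_CHARS.includes(char)) return "reserved";
  return "other"; // spaces, non-ASCII text, and remaining symbols such as "%"
}

classifyUrlChar("a");  // "unreserved"
classifyUrlChar("~");  // "unreserved"
classifyUrlChar("&");  // "reserved"
classifyUrlChar(" ");  // "other"
classifyUrlChar("中"); // "other"
```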

encodeURI vs encodeURIComponent: One Rule to Tell Them Apart

JavaScript provides two functions, and developers frequently confuse them. There is exactly one question to ask: are you encoding an entire URL, or a single parameter value within a URL?

encodeURI: For Encoding a Complete URL

encodeURI preserves URL structural characters (because they are part of the URL itself) and only encodes characters that are genuinely illegal in a URL.

encodeURI("https://example.com/search?q=hello world&lang=en")
// → "https://example.com/search?q=hello%20world&lang=en"
// Note: ? & = are NOT encoded — they are part of the URL structure

Characters NOT encoded: A-Z a-z 0-9 ; , / ? : @ & = + $ - _ . ! ~ * ' ( ) #

encodeURIComponent: For Encoding a Parameter Value

encodeURIComponent encodes nearly all special characters, including ? & = # /, making it the right choice for encoding individual parameter values.

const keyword = "hello world & more"
const url = "https://example.com/search?q=" + encodeURIComponent(keyword)
// → "https://example.com/search?q=hello%20world%20%26%20more"
// & is encoded as %26 and will not interfere with parameter parsing

Characters NOT encoded: A-Z a-z 0-9 - _ . ! ~ * ' ( )

Side-by-Side Comparison

const str = "a=1&b=2/c?d=text"

encodeURI(str)
// → "a=1&b=2/c?d=text"
// = & / ? are all preserved

encodeURIComponent(str)
// → "a%3D1%26b%3D2%2Fc%3Fd%3Dtext"
// = & / ? are all encoded

The rule of thumb: use encodeURIComponent when building parameter values — you will almost never go wrong.
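One way to apply that rule consistently is to route every key and value through encodeURIComponent in a single place. The helper below is a minimal sketch; buildUrl is a hypothetical name, and it assumes the base URL has no query string of its own.

```javascript
// Hypothetical helper: append encoded query parameters to a base URL.
// Assumes `base` does not already contain "?" or "#".
function buildUrl(base, params) {
  const query = Object.entries(params)
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(String(value)))
    .join("&");
  return query ? base + "?" + query : base;
}

buildUrl("https://example.com/search", { q: "hello world & more", lang: "en" });
// "https://example.com/search?q=hello%20world%20%26%20more&lang=en"
```

Only the "?" and the "&" separators are written literally; anything that came from data has been encoded before it reaches the URL.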


The Most Common Developer Mistakes

Trap 1: Space Encoded as + Instead of %20

HTML form submissions using application/x-www-form-urlencoded encode spaces as +, not %20. This is a legacy convention that differs from standard percent-encoding.

Form submission: hello world → hello+world
Standard URL encoding: hello world → hello%20world

Be aware of this distinction when parsing on the backend. PHP's urldecode() converts + back to a space, but rawurldecode() does not.
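The same split exists in JavaScript. The sketch below contrasts decodeURIComponent, which follows plain percent-encoding, with URLSearchParams, which follows the form convention. The helper decodeFormValue is a hypothetical name introduced here for illustration.

```javascript
// decodeURIComponent follows percent-encoding: "+" is just a plus sign.
const plusKept = decodeURIComponent("hello+world");     // "hello+world"
const pctDecoded = decodeURIComponent("hello%20world"); // "hello world"

// URLSearchParams follows application/x-www-form-urlencoded: "+" means space.
const fromPlus = new URLSearchParams("q=hello+world").get("q");  // "hello world"
const fromPct = new URLSearchParams("q=hello%20world").get("q"); // "hello world"
const serialized = new URLSearchParams({ q: "hello world" }).toString(); // "q=hello+world"

// Hypothetical helper for decoding a single form-encoded value by hand.
function decodeFormValue(value) {
  return decodeURIComponent(value.replace(/\+/g, " "));
}
decodeFormValue("hello+world%21"); // "hello world!"
```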

Trap 2: Encoding the Same Value Twice

// First encoding
const encoded = encodeURIComponent("hello world")
// → "hello%20world"

// Encoded again by mistake
const doubleEncoded = encodeURIComponent(encoded)
// → "hello%2520world"  (%25 is the encoding of %)

The backend receives hello%2520world, decodes it once to get hello%20world, and would need a second decode to recover the space. Double encoding like this is a frequent cause of API integration bugs.
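The round trip, and the reason an extra decode "just in case" is not a safe repair, can be sketched as follows. The sample values are illustrative only.

```javascript
const original = "hello world";
const doubled = encodeURIComponent(encodeURIComponent(original)); // "hello%2520world"

const decodedOnce = decodeURIComponent(doubled);      // "hello%20world"
const decodedTwice = decodeURIComponent(decodedOnce); // "hello world"

// A correctly encoded value that contains "%" breaks if it is decoded one time too many.
const percentValue = decodeURIComponent(encodeURIComponent("100%")); // "100%"
let extraDecodeFailed = false;
try {
  decodeURIComponent(percentValue); // "%" is not followed by two hex digits
} catch (err) {
  extraDecodeFailed = err instanceof URIError;
}
// extraDecodeFailed is true
```

The more robust fix is to find the layer that encodes twice and remove one of the calls, so that each value is encoded exactly once and decoded exactly once.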

Trap 3: encodeURI Does Not Encode #

# denotes a URL fragment (anchor), so encodeURI deliberately leaves it alone. If a parameter value contains #, you must use encodeURIComponent.

encodeURI("https://example.com?tag=#title")
// → "https://example.com?tag=#title"  # is NOT encoded!

"https://example.com?tag=" + encodeURIComponent("#title")
// → "https://example.com?tag=%23title"  correct
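Parsing both URLs with the standard URL class shows where the value ends up. This is a sketch for illustration; the variable names are arbitrary.

```javascript
// Unencoded "#": the parser treats "#title" as the fragment, so "tag" is empty.
const unencoded = new URL("https://example.com?tag=#title");
const tagUnencoded = unencoded.searchParams.get("tag"); // ""
const hashUnencoded = unencoded.hash;                   // "#title"

// Encoded "#": the value stays inside the query string.
const encodedUrl = new URL("https://example.com?tag=" + encodeURIComponent("#title"));
const tagEncoded = encodedUrl.searchParams.get("tag"); // "#title"
const hashEncoded = encodedUrl.hash;                   // ""
```

Because browsers do not send the fragment to the server, the unencoded version reaches the backend with an empty tag parameter.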

Trap 4: Decoding Functions Behave Differently Across Languages

# Python
from urllib.parse import unquote, unquote_plus
unquote("hello%20world")       # → "hello world"  (%20 → space)
unquote_plus("hello+world")    # → "hello world"  (+ → space)

// Java
URLDecoder.decode("hello+world", "UTF-8")    // → "hello world" (+ decoded as space)
URI.create("hello%20world").getPath()         // → "hello world" (standard decode)

// PHP
urldecode("hello+world")      // → "hello world" (+ decoded as space)
rawurldecode("hello+world")   // → "hello+world" (+ not decoded)
rawurldecode("hello%20world") // → "hello world"

Agree on an encoding convention between frontend and backend, and make sure decoding functions match accordingly.


Handling Non-ASCII Characters in URLs

When you type a URL containing non-ASCII characters (Chinese, Arabic, accented letters) into a browser address bar, the browser automatically encodes them before sending the request. For example, visiting:

https://example.com/search?q=中文

The actual request sent over the wire is:

https://example.com/search?q=%E4%B8%AD%E6%96%87

From an SEO perspective, non-ASCII URLs are well-supported by modern search engines. Just keep in mind that server logs and code will show the encoded form — remember to decode before reading.
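The same conversion can be observed from code. The sketch below uses the standard URL class, which percent-encodes non-ASCII characters in the query string as UTF-8, and then decodes the logged form back into readable text.

```javascript
// Parsing a URL that contains raw non-ASCII characters in the query.
const searchUrl = new URL("https://example.com/search?q=中文");
const wireForm = searchUrl.href;
// "https://example.com/search?q=%E4%B8%AD%E6%96%87"

// Reading the value back, either through the parser or by decoding a logged string.
const readable = searchUrl.searchParams.get("q");             // "中文"
const fromLog = decodeURIComponent("%E4%B8%AD%E6%96%87");     // "中文"
```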


Decode It Instantly

When you encounter an unreadable percent-encoded URL string, or need to encode Chinese characters or special symbols into URL-safe format, paste it into the URL Encoding/Decoding Tool on toolshu.com. It supports both encodeURI and encodeURIComponent modes — no setup needed.
