Zero-Width Characters: Invisible but Surprisingly Powerful

Zero-width characters are Unicode code points that take up no horizontal space when rendered. They're invisible in most contexts yet can have significant effects on text layout, line breaking, text direction, and — unfortunately — security. Understanding them is useful for anyone working with internationalized text.

Common Zero-Width Characters

The most common zero-width characters are the Zero Width Space (U+200B), which marks a permitted line-break position without inserting a visible space; the Zero Width Non-Joiner (U+200C), which prevents ligature formation and cursive joining in scripts such as Arabic and Persian; the Zero Width Joiner (U+200D), which requests joining and is used extensively in emoji sequences (e.g., family emoji are composed with ZWJ); and the Word Joiner (U+2060), which prevents a line break at its position.
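As a quick illustration, the four characters above can be looked up with Python's standard unicodedata module. This is only a sketch: the helper name describe and the ZERO_WIDTH list are made up for this example, and the characters are written as escapes so they stay visible in the source.

```python
import unicodedata

# The four characters discussed above, written as escapes so they
# remain visible in source code. (List name is illustrative only.)
ZERO_WIDTH = ["\u200b", "\u200c", "\u200d", "\u2060"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


for ch in ZERO_WIDTH:
    print(describe(ch))
# U+200B ZERO WIDTH SPACE
# U+200C ZERO WIDTH NON-JOINER
# U+200D ZERO WIDTH JOINER
# U+2060 WORD JOINER
```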

Zero-Width Characters in Emoji

The ZWJ (U+200D) is the secret behind many complex emoji. The 👨‍👩‍👧 family emoji is actually three separate emoji joined by two ZWJ characters: 👨 + ZWJ + 👩 + ZWJ + 👧. Systems that don't support the sequence display the component emoji separately; supporting systems render them as a single combined image. ZWJ appears in many other emoji sequences, which you can explore in our character browser.
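The composition described above can be reproduced in a few lines of Python. This is a minimal sketch using escapes for the three component emoji; the variable names are chosen for this example only.

```python
ZWJ = "\u200d"  # Zero Width Joiner

man = "\U0001F468"    # 👨
woman = "\U0001F469"  # 👩
girl = "\U0001F467"   # 👧

# Join the three emoji with ZWJ to form the family sequence.
family = ZWJ.join([man, woman, girl])

print(len(family))        # 5 code points: three emoji plus two ZWJs
print(family.split(ZWJ) == [man, woman, girl])  # True
```

Whether the result shows up as one image or three depends on the font and platform doing the rendering, not on the string itself.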

Security Implications

Zero-width characters can be used maliciously. Hidden text can be embedded in documents by inserting zero-width characters between visible characters. Source code can be manipulated with invisible characters that change how a program behaves without changing how it looks to a reviewer; the best-known example is the Trojan Source attack, which relies mainly on invisible bidirectional control characters. Text that appears identical visually may differ when copied, pasted, or processed programmatically.
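To make the last point concrete, here is a small Python sketch of two strings that look the same on screen but are not equal. The strings and the allowed_users set are invented for illustration.

```python
visible = "admin"
tampered = "ad\u200bmin"  # same letters with a Zero Width Space inserted

# The two strings typically render identically, but they are different data.
print(visible == tampered)          # False
print(len(visible), len(tampered))  # 5 6

# Any exact-match lookup treats them as different values.
allowed_users = {"admin"}  # hypothetical allow-list
print(tampered in allowed_users)    # False
```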

Detection and Removal

Paste suspicious text into our text encoder to see all code points in the string, including invisible ones. Zero-width characters will appear as named entries despite contributing no visible width.
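If you would rather check text programmatically, the same idea can be sketched in Python. This is not how the text encoder itself works; the function names find_zero_width and strip_zero_width are hypothetical, and the set only covers the four characters discussed in this article.

```python
import unicodedata

# Only the four characters covered in this article; extend as needed.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060"}


def find_zero_width(text: str) -> list[tuple[int, str, str]]:
    """Return (index, 'U+XXXX', name) for each zero-width character found."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if ch in ZERO_WIDTH
    ]


def strip_zero_width(text: str) -> str:
    """Return text with the listed zero-width characters removed."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)


sample = "ad\u200bmin"
print(find_zero_width(sample))   # [(2, 'U+200B', 'ZERO WIDTH SPACE')]
print(strip_zero_width(sample))  # admin
```

Blanket removal is not always safe: stripping ZWJ breaks emoji sequences into their components, and stripping ZWNJ can change how Arabic or Persian text joins, so it is usually better to report what was found before deleting anything.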
