Bidirectional Text: How Unicode Handles Arabic and Hebrew


Most writing systems run left-to-right, but Arabic, Hebrew, and several other scripts run right-to-left; languages such as Persian and Urdu are written in the Arabic script and share its direction. Text that mixes both directions, such as an English sentence containing an Arabic phrase or an Arabic document with Latin brand names, requires special Unicode handling to display correctly.

The Unicode Bidirectional Algorithm

The Unicode Bidirectional Algorithm (UBA), specified in Unicode Standard Annex #9, determines the visual display order of characters that are stored in logical (reading) order. Each character has a Bidi_Class property. The main groups are strong left-to-right (like Latin letters), strong right-to-left (like Hebrew and Arabic letters), weak (like digits and number separators), neutral (like spaces and most punctuation), and explicit formatting characters. The algorithm uses these classes in context to resolve an embedding level for every character, then reverses the right-to-left runs for display.
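
As a small illustration of these classes, the sketch below looks up the Bidi class of a few characters with Python's standard unicodedata module. The sample characters and the helper name bidi_class are chosen here for illustration only; the class codes themselves come from the Unicode Character Database bundled with Python.

```python
import unicodedata

# Sample characters picked for this sketch: Latin A, Hebrew alef, Arabic ain,
# a digit, a space, an exclamation mark, and the Right-to-Left Override.
samples = ["A", "\u05D0", "\u0639", "1", " ", "!", "\u202E"]


def bidi_class(ch: str) -> str:
    """Return the short Bidi_Class code (for example 'L', 'R', 'AL') of one character."""
    return unicodedata.bidirectional(ch)


for ch in samples:
    # Print the code point and name rather than the character itself,
    # so that invisible or direction-changing characters stay readable.
    name = unicodedata.name(ch, "<unnamed>")
    print(f"U+{ord(ch):04X}  {name:<26} {bidi_class(ch)}")
```

With the Unicode data shipped in current Python versions, this reports L for the Latin letter, R for Hebrew alef, AL for Arabic ain, EN for the digit, WS for the space, ON for the exclamation mark, and RLO for the override character.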

Bidi Control Characters

Unicode includes a set of invisible control characters that override or refine the algorithm's default behavior. The Right-to-Left Override (U+202E) forces the characters that follow it to display right-to-left, until a Pop Directional Formatting character (U+202C) or the end of the paragraph. It has legitimate uses in file names and product labels but can also be abused to hide malicious content. The Left-to-Right Mark (U+200E) and Right-to-Left Mark (U+200F) are zero-width characters that act like an invisible strong letter of the given direction, so they influence how neighboring neutral characters resolve without being visible themselves.
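
Because these characters are invisible, it can help to make them explicit before showing a string to a user. The following is a minimal sketch, not a complete sanitizer: the set BIDI_CONTROLS is hand-written for this example, and the helper name reveal_bidi_controls and the sample file name are hypothetical.

```python
# Directional marks plus the explicit embedding, override, and isolate controls.
# This set is written out by hand for the sketch, not taken from a library.
BIDI_CONTROLS = {
    "\u061C", "\u200E", "\u200F",                      # ALM, LRM, RLM
    "\u202A", "\u202B", "\u202C", "\u202D", "\u202E",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}


def reveal_bidi_controls(text: str) -> str:
    """Replace every Bidi control character with a visible <U+XXXX> tag."""
    return "".join(
        f"<U+{ord(ch):04X}>" if ch in BIDI_CONTROLS else ch
        for ch in text
    )


# Hypothetical file name: the stored string really ends in ".exe", but the
# override reverses the display of everything after it.
name = "invoice\u202Efdp.exe"
print(reveal_bidi_controls(name))  # prints: invoice<U+202E>fdp.exe
```

A Bidi-aware renderer would typically show the raw string as something like "invoiceexe.pdf", which is why revealing or rejecting the control character is safer than displaying the name as-is.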

Bidi in HTML

HTML provides the dir attribute and the <bdi> and <bdo> elements to control text direction. The <bdi> (Bidirectional Isolate) element isolates a span of text whose direction may differ from the surrounding content, which is useful when embedding user-generated content with unknown directionality. The <bdo> (Bidirectional Text Override) element instead forces its contents to display in the direction given by its dir attribute.
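
One way a server-side template helper might apply <bdi> to user-generated content is sketched below. The function names render_username and render_score_line and the list-item markup are assumptions made for this example; only html.escape comes from the standard library.

```python
from html import escape


def render_username(username: str) -> str:
    """Escape untrusted text and wrap it in <bdi> to isolate its direction."""
    return f"<bdi>{escape(username)}</bdi>"


def render_score_line(username: str, points: int) -> str:
    """Build one list item; the markup around the name is illustrative."""
    return f"<li>{render_username(username)}: {points} points</li>"


# A right-to-left name (Hebrew alef, bet, gimel), written with escapes so the
# source line itself is not reordered by the editor.
print(render_score_line("\u05D0\u05D1\u05D2", 3))
```

Without the <bdi> wrapper, a right-to-left name followed by ": 3 points" can pull the number to the wrong side of the name; with it, the name is laid out on its own and the rest of the line keeps the page direction.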

Security: The Bidi Trojan Attack

The "Trojan Source" vulnerability (2021, CVE-2021-42574) showed that Bidi control characters embedded in source code can make the code look different to a human reviewer than it does to a compiler. A carefully placed RTL Override inside a comment or string literal can make a comment look like executable code, or vice versa. This is why code review tools and IDEs increasingly warn about Bidi control characters in source files. You can inspect Bidi properties of any character in our character browser.
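
A simple check of the kind such tools perform can be sketched in a few lines. This is an illustrative scanner only, not how any particular IDE or review tool works; the function name find_bidi_controls and the sample snippet are hypothetical. It flags the explicit embedding, override, and isolate controls by their Bidi class and deliberately ignores the directional marks.

```python
import unicodedata

# Bidi_Class codes of the explicit embedding, override, and isolate controls.
EXPLICIT_CLASSES = {"LRE", "RLE", "LRO", "RLO", "PDF", "LRI", "RLI", "FSI", "PDI"}


def find_bidi_controls(source: str):
    """Yield (line, column, character name) for each explicit Bidi control.

    Line and column numbers are 1-based.
    """
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.bidirectional(ch) in EXPLICIT_CLASSES:
                yield line_no, col, unicodedata.name(ch)


# Hypothetical source text with an override hidden inside a comment.
snippet = 'is_admin = False\n# check \u202E admin only\nprint("ok")\n'
for hit in find_bidi_controls(snippet):
    print(hit)  # prints: (2, 9, 'RIGHT-TO-LEFT OVERRIDE')
```

A real tool would also need to decide what to do with legitimate right-to-left text in comments and strings, which this sketch does not attempt.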
