ASCII

The original 7-bit character encoding standard. Covers 128 characters: English letters, digits, punctuation, and control codes. The foundation that UTF-8 was designed to be backwards-compatible with.

IANA name: US-ASCII
Width: Fixed (1 byte)
First published: 1963

Byte Structure

ASCII uses exactly 7 bits per character, stored in 1 byte whose high bit (0x80) is always 0. Values 0–31 and 127 are control characters; 32–126 are printable characters. UTF-8, Latin-1, and Windows-1252 all assign bytes 0x00–0x7F the same meanings as ASCII, which is why ASCII is a subset of each.
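The ranges above can be sketched as a small classifier. The function name `classify_ascii_byte` and its return labels are illustrative choices, not part of any standard API.

```python
def classify_ascii_byte(value: int) -> str:
    """Classify a single byte value using the ASCII ranges described above."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a single byte value (0-255)")
    if value & 0x80:
        # High bit set: outside the 7-bit ASCII range
        return "non-ascii"
    if value < 32 or value == 127:
        return "control"
    return "printable"


print(classify_ascii_byte(0x0A))  # control (line feed)
print(classify_ascii_byte(0x41))  # printable ('A')
print(classify_ascii_byte(0x7F))  # control (DEL)
print(classify_ascii_byte(0xE4))  # non-ascii
```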

When to Use ASCII

ASCII is appropriate when you're certain your data contains only basic English text, digits, and common punctuation. Since every ASCII character maps 1:1 to the same byte in UTF-8, Latin-1, and Windows-1252, ASCII data is readable in all three without conversion. For any system that needs to handle names, addresses, or international content, ASCII is insufficient — use UTF-8 instead.
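One way to check the "certain your data contains only ASCII" condition is to test the text before committing to the encoding. This is a minimal sketch, assuming Python 3.7 or later for `str.isascii()`; the helper name `is_ascii_safe` and the sample strings are made up for illustration.

```python
def is_ascii_safe(text: str) -> bool:
    """Return True when every character is in the ASCII range (U+0000 to U+007F)."""
    return text.isascii()  # available since Python 3.7


sample = "Order #1234: $19.99"
print(is_ascii_safe(sample))  # True

# ASCII-only text produces identical bytes in all four encodings
encodings = ("ascii", "utf-8", "latin-1", "cp1252")
byte_forms = {sample.encode(enc) for enc in encodings}
print(len(byte_forms))  # 1

# A single accented letter is enough to rule ASCII out
print(is_ascii_safe("Zo\u00eb"))  # False
```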

Sample Characters in ASCII

The table below shows how a selection of characters is represented in ASCII. Bytes are shown in hexadecimal and decimal. Characters marked "not supported" cannot be encoded in ASCII and would need to be replaced or transliterated when converting from Unicode.

Character | Codepoint | Name | Bytes (Hex) | Bytes (Decimal) | Supported
A | U+0041 | LATIN CAPITAL LETTER A | 41 | 65 | Yes
a | U+0061 | LATIN SMALL LETTER A | 61 | 97 | Yes
0 | U+0030 | DIGIT ZERO | 30 | 48 | Yes
$ | U+0024 | DOLLAR SIGN | 24 | 36 | Yes
£ | U+00A3 | POUND SIGN | - | - | not supported
© | U+00A9 | COPYRIGHT SIGN | - | - | not supported
€ | U+20AC | EURO SIGN | - | - | not supported
α | U+03B1 | GREEK SMALL LETTER ALPHA | - | - | not supported
А | U+0410 | CYRILLIC CAPITAL LETTER A | - | - | not supported
中 | U+4E2D | CJK UNIFIED IDEOGRAPH-4E2D | - | - | not supported
あ | U+3042 | HIRAGANA LETTER A | - | - | not supported
☺ | U+263A | WHITE SMILING FACE | - | - | not supported
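The two fallbacks mentioned above, replacing and transliterating, can be sketched as follows. The helper `to_ascii` and its `mode` parameter are hypothetical names. The transliteration here is deliberately crude: it relies on Unicode NFKD decomposition, so it only recovers base letters from accented ones and silently drops characters such as £ that have no decomposition.

```python
import unicodedata


def to_ascii(text: str, mode: str = "replace") -> bytes:
    """Convert text to ASCII bytes, replacing or crudely transliterating the rest."""
    if mode == "transliterate":
        # NFKD splits an accented letter into base letter + combining mark;
        # errors="ignore" then drops the marks and anything else non-ASCII
        decomposed = unicodedata.normalize("NFKD", text)
        return decomposed.encode("ascii", errors="ignore")
    # Default: substitute "?" for every character ASCII cannot represent
    return text.encode("ascii", errors="replace")


print(to_ascii("caf\u00e9 \u00a35"))                        # b'caf? ?5'
print(to_ascii("caf\u00e9 \u00a35", mode="transliterate"))  # b'cafe 5'
```

Both modes lose information, so neither result can be decoded back to the original string.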

Working with ASCII in Code

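Every major language has built-in support for encoding conversion. The examples below show how to encode a string to ASCII bytes and decode it back to a Unicode string. The sample string contains two non-ASCII characters, so a strict ASCII encoder rejects it and a lenient one substitutes a replacement character. Always specify the encoding explicitly; never rely on system defaults, which vary by OS and locale.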

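# Python: encode a string to ASCII bytes
text = "Hello, 世界"
# Strict encoding raises UnicodeEncodeError for non-ASCII characters,
# so pick an error handler explicitly; "replace" substitutes "?"
encoded = text.encode("US-ASCII", errors="replace")  # b'Hello, ??'

# Decode bytes back to a string
decoded = encoded.decode("US-ASCII")  # 'Hello, ??'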
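
// PHP: convert to ASCII. Characters with no ASCII equivalent are
// replaced with the substitute character ("?" by default)
$bytes = mb_convert_encoding(
    "Hello, 世界",
    "US-ASCII",
    "UTF-8"
);

// Convert back to UTF-8. The replaced characters are not recovered
$text = mb_convert_encoding(
    $bytes,
    "UTF-8",
    "US-ASCII"
);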
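
// JavaScript: TextEncoder always produces UTF-8, which matches ASCII
// byte-for-byte only when every character is in U+0000–U+007F
const encoder = new TextEncoder();
const bytes = encoder.encode("Hello, 世界");
const isAscii = bytes.every((b) => b < 0x80); // false for this string

// Decode bytes. In the WHATWG Encoding Standard the "US-ASCII" label
// is an alias for windows-1252, so bytes >= 0x80 are not rejected
const decoder = new TextDecoder("US-ASCII");
const text = decoder.decode(bytes);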
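
-- PostgreSQL has no strict ASCII encoding. SQL_ASCII interprets bytes
-- 0-127 as ASCII and stores bytes 128-255 without validation
CREATE DATABASE mydb
  ENCODING 'SQL_ASCII'
  LC_COLLATE 'C'
  LC_CTYPE 'C'
  TEMPLATE template0;

-- Check database encoding
SELECT pg_encoding_to_char(encoding)
FROM pg_database
WHERE datname = current_database();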

ASCII FAQ

Is ASCII a subset of UTF-8?

Yes. ASCII characters (codepoints 0–127) are encoded in UTF-8 as a single byte with the same byte value. Any valid ASCII file is simultaneously valid UTF-8. The high bit (bit 7) is always 0 for ASCII characters, which UTF-8 uses to distinguish single-byte sequences from multi-byte ones.
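The role of the high bit can be made visible by labelling each byte of a UTF-8 string. This is a small sketch; `utf8_byte_roles` and its label strings are names chosen for illustration.

```python
def utf8_byte_roles(text: str) -> list:
    """Label each UTF-8 byte of text as ASCII, lead, or continuation."""
    roles = []
    for byte in text.encode("utf-8"):
        if byte < 0x80:
            role = "single-byte (ASCII)"  # high bit clear: 0xxxxxxx
        elif byte & 0xC0 == 0x80:
            role = "continuation"         # 10xxxxxx
        else:
            role = "lead"                 # 11xxxxxx starts a multi-byte sequence
        roles.append((f"{byte:02X}", role))
    return roles


for hex_byte, role in utf8_byte_roles("A\u00a3"):
    print(hex_byte, role)
# 41 single-byte (ASCII)
# C2 lead
# A3 continuation
```

Every byte of a multi-byte sequence has the high bit set, so an ASCII byte can never appear inside the encoding of a non-ASCII character.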

What characters does ASCII include?

ASCII includes 128 characters: 33 control characters (0–31 and 127), the space character (32), 10 digits (0–9), 26 uppercase and 26 lowercase Latin letters, and 32 punctuation and symbol characters. Accented letters, currency symbols (other than $), and all non-Latin scripts are not in ASCII.
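The tally can be checked with Python's standard `string` module. Note that the total reaches 128 only when the space character (code 32) is counted as its own category; the dictionary keys below are just labels for this sketch.

```python
import string

# Control characters: codes 0-31 plus DEL (127)
control_codes = [code for code in range(128) if code < 32 or code == 127]

counts = {
    "control": len(control_codes),
    "space": 1,
    "digits": len(string.digits),
    "uppercase": len(string.ascii_uppercase),
    "lowercase": len(string.ascii_lowercase),
    "punctuation": len(string.punctuation),
}

for category, count in counts.items():
    print(category, count)
print("total", sum(counts.values()))  # total 128
```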

Why is ASCII still in use today?

ASCII remains the lowest common denominator for interoperability. Most source code, configuration files, email headers, and protocol tokens use ASCII. Because UTF-8 encodes every ASCII character as the same single byte, systems that accept UTF-8 read ASCII content without changes. For English-only text and programming syntax, ASCII is sufficient and universally supported.