ASCII

The original 7-bit character encoding standard. Covers 128 characters: English letters, digits, punctuation, and control codes. The foundation that UTF-8 was designed to be backwards-compatible with.

IANA Name

US-ASCII

Width

Fixed (1 byte)

Introduced

1963

Byte Structure

ASCII defines exactly 7 bits per character; in practice each character is stored in 1 byte with the high bit (0x80) set to 0. Values 0–31 and 127 are control characters; 32–126 are printable characters (32 is the space). Because the high bit is never set, ASCII text is byte-for-byte identical in UTF-8, Latin-1, and Windows-1252, all of which are supersets of ASCII.
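As a rough illustration of these ranges, the sketch below classifies a byte value by testing the high bit and the control-character ranges. The function name `classify_byte` is hypothetical and exists only for this example.

```python
def classify_byte(value):
    """Classify a byte value (0-255) using the ASCII ranges described above."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a byte value in the range 0-255")
    if value & 0x80:
        # High bit set: the value lies outside the 7-bit ASCII range.
        return "not ascii"
    if value < 32 or value == 127:
        return "control"
    return "printable"


print(classify_byte(0x41))  # the letter A
print(classify_byte(0x0A))  # line feed
print(classify_byte(0xE9))  # a byte with the high bit set
```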

When to Use ASCII

ASCII is appropriate when you're certain your data contains only basic English text, digits, and common punctuation. Since every ASCII character maps 1:1 to the same byte in UTF-8, Latin-1, and Windows-1252, ASCII data is readable in all three without conversion. For any system that needs to handle names, addresses, or international content, ASCII is insufficient — use UTF-8 instead.
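One way to test the "only ASCII" assumption before relying on it is to check that no byte has the high bit set. The sketch below does that and then confirms that the same bytes decode to the same text under all three supersets. The helper `is_pure_ascii` and the sample data are illustrative assumptions; Python 3.7+ also offers the built-in `bytes.isascii()` for the same check.

```python
def is_pure_ascii(data):
    """Return True if every byte is below 0x80 (high bit clear)."""
    return all(b < 0x80 for b in data)


sample = b"Order #42: 3 items, total $19.99"

if is_pure_ascii(sample):
    # The same bytes decode to the same text under each ASCII superset.
    decoded = {enc: sample.decode(enc) for enc in ("utf-8", "latin-1", "cp1252")}
    assert len(set(decoded.values())) == 1
```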

Sample Characters in ASCII

The table below shows how a selection of characters is represented in ASCII. Bytes are shown in hexadecimal and decimal. Characters marked "not supported" cannot be encoded in ASCII and would need to be replaced or transliterated when converting from Unicode.

Character	Codepoint	Name	Bytes (Hex)	Bytes (Decimal)	Supported
A	U+0041	LATIN CAPITAL LETTER A	41	65	Yes
a	U+0061	LATIN SMALL LETTER A	61	97	Yes
0	U+0030	DIGIT ZERO	30	48	Yes
$	U+0024	DOLLAR SIGN	24	36	Yes
£	U+00A3	POUND SIGN	not supported		—
©	U+00A9	COPYRIGHT SIGN	not supported		—
€	U+20AC	EURO SIGN	not supported		—
α	U+03B1	GREEK SMALL LETTER ALPHA	not supported		—
А	U+0410	CYRILLIC CAPITAL LETTER A	not supported		—
中	U+4E2D	CJK UNIFIED IDEOGRAPH-4E2D	not supported		—
あ	U+3042	HIRAGANA LETTER A	not supported		—
☺	U+263A	WHITE SMILING FACE	not supported		—
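For the unsupported rows, the replace-or-transliterate choice can be sketched with Python's standard error handlers. The sample string is an assumption made for illustration (written with escapes so the code points are unambiguous), and the NFKD approach is only a crude transliteration: it handles accented Latin letters but simply drops symbols such as £ and €.

```python
import unicodedata

# "Café costs £3 or €4", written with escapes
text = "Caf\u00e9 costs \u00a33 or \u20ac4"

# Replace each unsupported character with "?".
replaced = text.encode("ascii", errors="replace")

# Drop unsupported characters entirely.
ignored = text.encode("ascii", errors="ignore")

# Keep the information as ASCII escape sequences.
escaped = text.encode("ascii", errors="backslashreplace")

# Crude transliteration: split accented letters into base letter plus
# combining mark (NFKD), then drop everything that is still non-ASCII.
transliterated = unicodedata.normalize("NFKD", text).encode("ascii", errors="ignore")

for label, value in [("replace", replaced), ("ignore", ignored),
                     ("backslashreplace", escaped), ("NFKD", transliterated)]:
    print(label, value)
```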

Working with ASCII in Code

Most major languages have built-in support for encoding conversion. The examples below show how to encode a string to ASCII bytes and decode it back to a Unicode string. Always specify the encoding explicitly rather than relying on system defaults, which vary by OS and locale.

Python

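# Encode a string to ASCII bytes
text = "Hello, 世界"
# Non-ASCII characters raise UnicodeEncodeError by default;
# errors="replace" substitutes "?" for each one instead
encoded = text.encode("US-ASCII", errors="replace")  # b'Hello, ??'

# Decode bytes back to a string
decoded = encoded.decode("US-ASCII")  # 'Hello, ??'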

PHP

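// Convert to ASCII; characters that cannot be represented are
// replaced with "?" by default (see mb_substitute_character())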
$bytes = mb_convert_encoding(
    "Hello, 世界",
    "US-ASCII",
    "UTF-8"
);

// Convert back to UTF-8
$text = mb_convert_encoding(
    $bytes,
    "UTF-8",
    "US-ASCII"
);

JavaScript

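// TextEncoder always produces UTF-8; for ASCII-only text
// the UTF-8 bytes are identical to the ASCII bytes
const encoder = new TextEncoder();
const bytes = encoder.encode("Hello, world");
const isAscii = bytes.every((b) => b < 0x80); // true for this string

// Decode bytes. In the WHATWG Encoding Standard the "US-ASCII" label
// maps to windows-1252; the result is the same for bytes below 0x80
const decoder = new TextDecoder("US-ASCII");
const text = decoder.decode(bytes);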

SQL (PostgreSQL)

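-- Create a database with PostgreSQL's ASCII-oriented encoding.
-- PostgreSQL calls it SQL_ASCII; bytes 128-255 are stored
-- as-is, without validation or conversion.
CREATE DATABASE mydb
  ENCODING 'SQL_ASCII'
  LC_COLLATE 'C'
  LC_CTYPE 'C'
  TEMPLATE template0;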

-- Check database encoding
SELECT pg_encoding_to_char(encoding)
FROM pg_database
WHERE datname = current_database();

ASCII FAQ

Is ASCII a subset of UTF-8?

Yes. ASCII characters (codepoints 0–127) are encoded in UTF-8 as a single byte with the same byte value. Any valid ASCII file is simultaneously valid UTF-8. The high bit (bit 7) is always 0 for ASCII characters, which UTF-8 uses to distinguish single-byte sequences from multi-byte ones.
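The high-bit rule can be observed directly. The sketch below, using a hypothetical helper named `utf8_high_bits`, lists the high bit of every byte in a character's UTF-8 encoding: ASCII characters yield a single byte with the bit clear, while every byte of a multi-byte sequence has it set.

```python
def utf8_high_bits(ch):
    """Return the high bit (0 or 1) of each byte in the UTF-8 encoding of ch."""
    return [b >> 7 for b in ch.encode("utf-8")]


# A, e with acute accent, euro sign, and the CJK ideograph U+4E2D
for ch in ("A", "\u00e9", "\u20ac", "\u4e2d"):
    encoded = ch.encode("utf-8")
    hex_bytes = " ".join(f"{b:02x}" for b in encoded)
    print(f"U+{ord(ch):04X}", hex_bytes, utf8_high_bits(ch))
```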

What characters does ASCII include?

ASCII includes 128 characters: 33 control characters (0–31 and 127), the space character (32), 10 digits (0–9), 26 uppercase and 26 lowercase Latin letters, and 32 punctuation and symbol characters. Accented letters, currency symbols (other than $), and all non-Latin scripts are not in ASCII.
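The tally can be checked against Python's `string` module constants. This is a minimal sketch; note that it counts the space character (32) as its own category, which is what brings the total to 128.

```python
import string

# Control characters: values 0-31 plus 127 (DEL).
control = [c for c in range(128) if c < 32 or c == 127]

counts = {
    "control": len(control),
    "space": 1,
    "digits": len(string.digits),
    "uppercase": len(string.ascii_uppercase),
    "lowercase": len(string.ascii_lowercase),
    "punctuation and symbols": len(string.punctuation),
}
total = sum(counts.values())

for name, count in counts.items():
    print(f"{name}: {count}")
print("total:", total)
```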

Why is ASCII still in use today?

ASCII remains the lowest common denominator for interoperability. Most source code, configuration files, email headers, and protocol tokens use ASCII. Because UTF-8 encodes every ASCII character as the same single byte, systems that accept UTF-8 also accept ASCII content without changes. For English-only text and programming syntax, ASCII is sufficient and universally supported.
