GBK

Character encoding for Simplified Chinese that extends the GB2312 national standard. Superset of GB2312. Variable-width: one byte for ASCII, two bytes for Chinese characters. Dominant legacy encoding for Simplified Chinese on Windows.

Name: GBK
Width: Variable (1–2 bytes)
Year: 1993

Byte Structure

GBK uses variable-width encoding (1–2 bytes per character). Characters not in this encoding cannot be represented and must be replaced or transliterated.
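
The one-or-two-byte structure can be illustrated by walking a byte string and splitting it into characters. The sketch below assumes the commonly documented GBK ranges: bytes 0x00–0x7F are single-byte ASCII, and a two-byte character has a lead byte in 0x81–0xFE followed by a trail byte in 0x40–0xFE (excluding 0x7F). The function name `split_gbk` is hypothetical, and the check covers structure only, not whether every two-byte pair is actually assigned.

```python
def split_gbk(data: bytes) -> list[bytes]:
    """Split GBK-encoded bytes into per-character byte sequences.

    Assumed ranges: 0x00-0x7F single byte; lead byte 0x81-0xFE
    followed by trail byte 0x40-0xFE except 0x7F.
    """
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead <= 0x7F:
            # Single-byte ASCII character
            chars.append(data[i:i + 1])
            i += 1
        elif 0x81 <= lead <= 0xFE:
            # Two-byte character: validate the trail byte
            if i + 1 >= len(data):
                raise ValueError(f"truncated two-byte sequence at offset {i}")
            trail = data[i + 1]
            if not (0x40 <= trail <= 0xFE) or trail == 0x7F:
                raise ValueError(f"invalid trail byte 0x{trail:02X} at offset {i + 1}")
            chars.append(data[i:i + 2])
            i += 2
        else:
            raise ValueError(f"invalid lead byte 0x{lead:02X} at offset {i}")
    return chars


pieces = split_gbk("A中".encode("gbk"))
print([p.hex() for p in pieces])  # ['41', 'd6d0']
```

Because a trail byte can fall in the ASCII range (0x40–0x7E), GBK text cannot be safely split or searched byte by byte without tracking lead bytes this way.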

When to Use GBK

GBK is the standard encoding for Simplified Chinese on Windows and in legacy Chinese software. It's required when producing or consuming Simplified Chinese content for older Windows applications, legacy databases, or older Chinese websites. New systems targeting Chinese users should use UTF-8 or GB18030.
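
When legacy GBK content has to move into a newer UTF-8 system, the conversion is usually a decode followed by a re-encode. The following is a minimal sketch, not a full migration tool: the helper name `gbk_to_utf8` and the file names are hypothetical, and it assumes the source file really is GBK (a wrong guess raises `UnicodeDecodeError` or silently produces garbage).

```python
import tempfile
from pathlib import Path


def gbk_to_utf8(src: Path, dst: Path) -> None:
    """Read a GBK-encoded text file and write it back out as UTF-8."""
    text = src.read_text(encoding="gbk")   # explicit source encoding
    dst.write_text(text, encoding="utf-8")  # explicit target encoding


# Demonstration with a throwaway directory and hypothetical file names
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "legacy.txt"
    dst = Path(tmp) / "converted.txt"
    src.write_bytes("中文".encode("gbk"))
    gbk_to_utf8(src, dst)
    result = dst.read_bytes()

print(result.decode("utf-8"))  # 中文
```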

Sample Characters in GBK

The table below shows how a selection of characters is represented in GBK. Bytes are shown in hexadecimal and decimal. Characters marked "not supported" cannot be encoded in GBK and would need to be replaced or transliterated when converting from Unicode.

Character  Codepoint  Name                        Bytes (Hex)  Bytes (Decimal)  Supported
A          U+0041     LATIN CAPITAL LETTER A      41           65               Yes
a          U+0061     LATIN SMALL LETTER A        61           97               Yes
0          U+0030     DIGIT ZERO                  30           48               Yes
$          U+0024     DOLLAR SIGN                 24           36               Yes
£          U+00A3     POUND SIGN                  -            -                not supported
©          U+00A9     COPYRIGHT SIGN              -            -                not supported
€          U+20AC     EURO SIGN                   -            -                not supported
α          U+03B1     GREEK SMALL LETTER ALPHA    A6 C1        166 193          Yes
А          U+0410     CYRILLIC CAPITAL LETTER A   A7 A1        167 161          Yes
中         U+4E2D     CJK UNIFIED IDEOGRAPH-4E2D  D6 D0        214 208          Yes
あ         U+3042     HIRAGANA LETTER A           A4 A2        164 162          Yes
☺          U+263A     WHITE SMILING FACE          -            -                not supported
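
Rows like these can be regenerated for any character. The sketch below uses Python's built-in `gbk` codec and the `unicodedata` module; the helper name `gbk_row` is hypothetical, and results reflect Python's GBK mapping, which may differ from other implementations for a few characters.

```python
import unicodedata


def gbk_row(ch: str) -> str:
    """Format one table row: character, codepoint, name, bytes, support."""
    codepoint = f"U+{ord(ch):04X}"
    name = unicodedata.name(ch, "UNKNOWN")
    try:
        raw = ch.encode("gbk")
    except UnicodeEncodeError:
        # The character has no GBK representation in Python's codec
        return f"{ch} {codepoint} {name} not supported"
    hex_bytes = " ".join(f"{b:02X}" for b in raw)
    dec_bytes = " ".join(str(b) for b in raw)
    return f"{ch} {codepoint} {name} {hex_bytes} {dec_bytes} Yes"


for ch in "A©中":
    print(gbk_row(ch))
```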

Working with GBK in Code

Most major languages have built-in or standard-library support for encoding conversion. The examples below show how to convert between Unicode strings and GBK bytes. Always specify the encoding explicitly; never rely on system defaults, which vary by OS and locale.

Python

# Encode a string to GBK bytes
text = "Hello, 世界"
encoded = text.encode("GBK")

# Decode bytes back to a string
decoded = encoded.decode("GBK")
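
The Python snippet above raises `UnicodeEncodeError` if the text contains a character GBK cannot represent. A small sketch of the standard `errors` handlers that implement the "replace" option is shown here; the sample string is illustrative, and which handler is appropriate depends on the consumer of the bytes.

```python
text = "© 中"  # the copyright sign has no GBK representation

# Strict mode (the default) raises and reports the failing position
try:
    text.encode("gbk")
    failed = None
except UnicodeEncodeError as exc:
    failed = text[exc.start:exc.end]

# Substitute a question mark for each unencodable character
replaced = text.encode("gbk", errors="replace")

# Substitute an XML/HTML numeric character reference instead
escaped = text.encode("gbk", errors="xmlcharrefreplace")

print(failed)    # ©
print(replaced)  # b'? \xd6\xd0'
print(escaped)   # b'&#169; \xd6\xd0'
```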

PHP

<?php
// Convert a UTF-8 string to GBK bytes
$bytes = mb_convert_encoding(
    "Hello, 世界",
    "GBK",
    "UTF-8"
);

// Convert back to UTF-8
$text = mb_convert_encoding(
    $bytes,
    "UTF-8",
    "GBK"
);
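
JavaScript

// TextEncoder only produces UTF-8, so there is no built-in way to
// encode to GBK; a third-party library is needed for that direction.
// Decoding GBK bytes is supported by TextDecoder.

// "Hello, 世界" as GBK bytes
const bytes = new Uint8Array([
  0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x2c, 0x20, // "Hello, "
  0xca, 0xc0, 0xbd, 0xe7,                   // "世界"
]);

// Decode GBK bytes to a string
const decoder = new TextDecoder("gbk");
const text = decoder.decode(bytes); // "Hello, 世界"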
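
SQL (PostgreSQL)

-- PostgreSQL accepts GBK as a client encoding only, not as a server
-- (database) encoding. Store the data as UTF-8 and let the server
-- convert for GBK clients.
CREATE DATABASE mydb
  ENCODING 'UTF8'
  LC_COLLATE 'en_US.UTF-8'
  LC_CTYPE 'en_US.UTF-8'
  TEMPLATE template0;

-- Declare that this session sends and receives GBK
SET client_encoding TO 'GBK';

-- Check database encoding
SELECT pg_encoding_to_char(encoding)
FROM pg_database
WHERE datname = current_database();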

Compare with Other Encodings

See how GBK differs from other encodings — which characters each supports and how the byte representations compare.
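
A quick way to see such differences is to encode the same text under several codecs and compare the bytes. This is a minimal sketch using Python's built-in codecs; the helper name `compare_encodings` and the default codec list are hypothetical choices, and `bytes.hex(" ")` requires Python 3.8 or later.

```python
def compare_encodings(text, encodings=("gbk", "gb18030", "utf-8")):
    """Map each encoding name to the hex bytes of text, or None if unsupported."""
    result = {}
    for name in encodings:
        try:
            result[name] = text.encode(name).hex(" ")
        except UnicodeEncodeError:
            # The text contains a character this encoding cannot represent
            result[name] = None
    return result


print(compare_encodings("中"))
print(compare_encodings("€"))
```

For "中", GBK and GB18030 produce the same two bytes (d6 d0) while UTF-8 needs three (e4 b8 ad). For "€", Python's GBK codec reports the character as unsupported, whereas GB18030 and UTF-8 can both encode it.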