GBK
GBK is a character encoding for Simplified Chinese that extends the GB2312 national standard and remains backward compatible with it. It is variable-width: ASCII characters take a single byte and Chinese characters take two bytes. It is the dominant legacy encoding for Simplified Chinese on Windows.
Byte Structure
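GBK uses variable-width encoding (1–2 bytes per character). Bytes 0x00–0x7F encode ASCII characters directly. A two-byte character consists of a lead byte in the range 0x81–0xFE followed by a trail byte in the range 0x40–0xFE (excluding 0x7F). Because a trail byte can fall inside the ASCII range, character boundaries can only be found reliably by scanning from the start of the data. Characters outside the GBK repertoire cannot be represented and must be replaced or transliterated.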
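The byte layout can be illustrated with a small boundary scanner. This is a minimal sketch rather than a full validator: `split_gbk` is a hypothetical helper, it assumes lead bytes 0x81–0xFE and trail bytes 0x40–0xFE excluding 0x7F, and it only checks byte ranges, not whether each pair is actually assigned to a character.

```python
def split_gbk(data: bytes) -> list:
    """Split a GBK byte string into per-character byte sequences (range checks only)."""
    chars = []
    i = 0
    while i < len(data):
        b = data[i]
        if b <= 0x7F:
            # Single-byte ASCII character
            chars.append(data[i:i + 1])
            i += 1
        elif 0x81 <= b <= 0xFE:
            # Lead byte: a valid trail byte must follow
            if i + 1 >= len(data):
                raise ValueError(f"truncated sequence at offset {i}")
            t = data[i + 1]
            if not (0x40 <= t <= 0xFE and t != 0x7F):
                raise ValueError(f"invalid trail byte 0x{t:02X} at offset {i + 1}")
            chars.append(data[i:i + 2])
            i += 2
        else:
            # 0x80 and 0xFF are treated as invalid lead bytes in this sketch
            raise ValueError(f"invalid lead byte 0x{b:02X} at offset {i}")
    return chars


encoded = "A中".encode("gbk")
print([c.hex() for c in split_gbk(encoded)])  # ['41', 'd6d0']
```

Scanning forward from the first byte matters here: jumping into the middle of the data could land on a trail byte that looks like an ASCII character.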
When to Use GBK
GBK is the standard encoding for Simplified Chinese on Windows and in legacy Chinese software. It's required when producing or consuming Simplified Chinese content for older Windows applications, legacy databases, or older Chinese websites. New systems targeting Chinese users should use UTF-8 or GB18030.
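When legacy GBK content has to move into a newer UTF-8 system, the conversion is usually a decode followed by a re-encode. The sketch below assumes the input really is GBK; `gbk_file_to_utf8` and the file names are hypothetical, and a temporary directory stands in for real legacy files.

```python
import tempfile
from pathlib import Path


def gbk_file_to_utf8(src: Path, dst: Path) -> None:
    """Re-encode a GBK text file as UTF-8 (hypothetical helper)."""
    # The default strict error handling raises UnicodeDecodeError
    # if the file contains bytes that are not valid GBK.
    text = src.read_text(encoding="gbk")
    dst.write_text(text, encoding="utf-8")


# Demonstration with temporary files instead of real legacy data
with tempfile.TemporaryDirectory() as tmp:
    legacy = Path(tmp) / "legacy.txt"
    modern = Path(tmp) / "modern.txt"
    legacy.write_bytes("简体中文".encode("gbk"))
    gbk_file_to_utf8(legacy, modern)
    converted = modern.read_bytes()

print(converted.decode("utf-8"))
```

Passing the encoding on both the read and the write keeps the result independent of the platform's default locale.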
Sample Characters in GBK
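The table below shows how a selection of characters is represented in GBK. Bytes are shown in hexadecimal and in decimal. Characters marked "not supported" cannot be encoded in GBK and would need to be replaced or transliterated when converting from Unicode. Support for a few characters varies between implementations: Windows code page 936, for example, additionally maps the euro sign to the single byte 0x80.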
| Character | Codepoint | Name | Bytes (Hex) | Bytes (Decimal) | Supported |
|---|---|---|---|---|---|
| A | U+0041 | LATIN CAPITAL LETTER A | 41 | 65 | Yes |
| a | U+0061 | LATIN SMALL LETTER A | 61 | 97 | Yes |
| 0 | U+0030 | DIGIT ZERO | 30 | 48 | Yes |
| $ | U+0024 | DOLLAR SIGN | 24 | 36 | Yes |
| £ | U+00A3 | POUND SIGN | not supported | — | No |
| © | U+00A9 | COPYRIGHT SIGN | not supported | — | No |
| € | U+20AC | EURO SIGN | not supported | — | No |
| α | U+03B1 | GREEK SMALL LETTER ALPHA | A6 C1 | 166 193 | Yes |
| А | U+0410 | CYRILLIC CAPITAL LETTER A | A7 A1 | 167 161 | Yes |
| 中 | U+4E2D | CJK UNIFIED IDEOGRAPH-4E2D | D6 D0 | 214 208 | Yes |
| あ | U+3042 | HIRAGANA LETTER A | A4 A2 | 164 162 | Yes |
| ☺ | U+263A | WHITE SMILING FACE | not supported | — | No |
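Rows like these can be regenerated with a few lines of Python. This is a sketch that assumes Python's built-in `gbk` codec as the reference mapping; `gbk_row` is a hypothetical helper, and other implementations may differ for a handful of characters.

```python
import unicodedata


def gbk_row(ch: str) -> tuple:
    """Return (codepoint, name, hex bytes or None) for a single character."""
    codepoint = f"U+{ord(ch):04X}"
    name = unicodedata.name(ch, "UNKNOWN")
    try:
        hex_bytes = ch.encode("gbk").hex(" ").upper()
    except UnicodeEncodeError:
        hex_bytes = None  # not representable in Python's gbk codec
    return codepoint, name, hex_bytes


for ch in "A©α中☺":
    print(ch, *gbk_row(ch))
```

A `None` in the last position corresponds to a "not supported" row.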
Working with GBK in Code
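Most major languages have built-in support for encoding conversion, although coverage varies: JavaScript's standard TextDecoder can decode GBK, but TextEncoder only produces UTF-8. The examples below, in Python, PHP, JavaScript, and SQL (PostgreSQL), show how to encode a string to GBK bytes and decode it back to a Unicode string. Always specify the encoding explicitly; never rely on system defaults, which vary by OS and locale.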
```python
# Encode a string to GBK bytes
text = "Hello, 世界"
encoded = text.encode("gbk")
# Decode bytes back to a string
decoded = encoded.decode("gbk")
```
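Encoding fails when the text contains a character outside the GBK repertoire, so it helps to decide up front how such characters should be handled. The sketch below uses Python's standard `errors` argument and assumes Python's built-in `gbk` codec, which does not include the euro sign; the sample string is illustrative only.

```python
text = "Price: 5€"

# Default (strict) handling raises on characters GBK cannot represent
try:
    text.encode("gbk")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# Lossy fallback: each unsupported character becomes "?"
replaced = text.encode("gbk", errors="replace")

# Inspectable fallback: each unsupported character becomes a \uXXXX escape
escaped = text.encode("gbk", errors="backslashreplace")

print(strict_failed)  # True
print(replaced)       # b'Price: 5?'
print(escaped)        # b'Price: 5\\u20ac'
```

Strict handling is usually the safer default, since the lossy modes silently change the data.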
```php
// Convert from UTF-8 to GBK
$bytes = mb_convert_encoding(
    "Hello, 世界",
    "GBK",
    "UTF-8"
);
// Convert back to UTF-8
$text = mb_convert_encoding(
    $bytes,
    "UTF-8",
    "GBK"
);
```
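```javascript
// TextEncoder only produces UTF-8, so encoding to GBK
// requires a third-party library.
// GBK bytes for "Hello, 世界"
const bytes = new Uint8Array([
  0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x2c, 0x20, // "Hello, "
  0xca, 0xc0, 0xbd, 0xe7,                   // "世界"
]);
// Decode GBK bytes with the built-in TextDecoder
const decoder = new TextDecoder("gbk");
const text = decoder.decode(bytes);
```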
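```sql
-- PostgreSQL supports GBK as a client encoding only, not as a
-- server (database) encoding. Store the data as UTF8 and let the
-- server convert to and from GBK for each client session.
CREATE DATABASE mydb
  ENCODING 'UTF8'
  LC_COLLATE 'en_US.UTF-8'
  TEMPLATE template0;
-- Declare that this session sends and receives GBK
SET client_encoding TO 'GBK';
-- Check database encoding
SELECT pg_encoding_to_char(encoding)
FROM pg_database
WHERE datname = current_database();
```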