Shift-JIS
Variable-width encoding for Japanese. Single-byte for ASCII and half-width kana, double-byte for kanji and full-width characters. The dominant encoding for Japanese on Windows and in legacy Japanese software.
Byte Structure
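Shift-JIS is a variable-width encoding that uses one or two bytes per character. Bytes 0x00–0x7F encode ASCII-range characters and bytes 0xA1–0xDF encode half-width katakana, each as a single byte. A byte in 0x81–0x9F or 0xE0–0xFC is a lead byte and must be followed by a trail byte in 0x40–0x7E or 0x80–0xFC. Because trail bytes overlap the ASCII range, byte-oriented code that searches for an ASCII character such as 0x5C (backslash) can mistake the second half of a double-byte character for a match. Characters outside the Shift-JIS repertoire cannot be represented and must be replaced or transliterated.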
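The one-or-two-byte structure can be made concrete with a small splitter. The following is a minimal Python sketch, not a full decoder: it assumes lead bytes in 0x81–0x9F and 0xE0–0xFC and trail bytes in 0x40–0x7E and 0x80–0xFC, and the function name `split_shift_jis` is a hypothetical helper introduced only for illustration.

```python
def split_shift_jis(data: bytes) -> list[bytes]:
    """Split Shift-JIS bytes into one bytes object per character.

    Sketch only: any byte that is not a lead byte is treated as a
    single-byte character, without checking that it is actually assigned.
    """
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if 0x81 <= lead <= 0x9F or 0xE0 <= lead <= 0xFC:
            # Lead byte: the next byte must be a valid trail byte.
            trail = data[i + 1] if i + 1 < len(data) else None
            if trail is None or not (0x40 <= trail <= 0x7E or 0x80 <= trail <= 0xFC):
                raise ValueError(f"invalid or missing trail byte at offset {i + 1}")
            chars.append(data[i:i + 2])
            i += 2
        else:
            # ASCII-range byte or half-width katakana (0xA1-0xDF).
            chars.append(data[i:i + 1])
            i += 1
    return chars


# "A", hiragana "あ", and half-width katakana "ｱ"
sample = "Aあｱ".encode("shift_jis")
print([c.hex(" ") for c in split_shift_jis(sample)])
```

Walking the bytes from the start like this is the only reliable way to find character boundaries, since a trail byte on its own can look like an ASCII character.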
When to Use Shift-JIS
Shift-JIS is the encoding for Japanese text in legacy Windows applications, older Japanese websites, and many Japanese file formats. If you're processing Japanese data from a source that pre-dates widespread UTF-8 adoption — or working with Japanese game ROMs and legacy software — you'll need Shift-JIS support. New Japanese systems should use UTF-8.
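For legacy data, a common first step is a one-time conversion to UTF-8. The sketch below shows one way to do that in Python; the function name `convert_file_to_utf8` and its parameters are hypothetical, and it assumes the source file really is Shift-JIS. Files produced on Windows are often in Microsoft's superset of the encoding, for which Python provides the separate `cp932` codec, so the source encoding is left as a parameter.

```python
from pathlib import Path


def convert_file_to_utf8(src: Path, dst: Path, source_encoding: str = "shift_jis") -> None:
    """Re-encode a legacy text file as UTF-8 (hypothetical helper)."""
    raw = src.read_bytes()
    # Strict decoding: raises UnicodeDecodeError instead of silently
    # producing garbage if the file is not in the assumed encoding.
    text = raw.decode(source_encoding)
    # Working on bytes keeps the original line endings untouched.
    dst.write_bytes(text.encode("utf-8"))


# Example call (paths are placeholders):
# convert_file_to_utf8(Path("legacy.txt"), Path("legacy.utf8.txt"))
```

Decoding strictly and failing loudly is a deliberate choice here: a wrong guess about the source encoding is easier to fix before the converted file replaces the original.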
Sample Characters in Shift-JIS
The table below shows how a selection of characters is represented in Shift-JIS. Bytes are shown in hexadecimal and in decimal. Characters marked "not supported" cannot be encoded in Shift-JIS and need to be replaced or transliterated when converting from Unicode.
| Character | Codepoint | Name | Bytes (Hex) | Bytes (Decimal) | Supported |
|---|---|---|---|---|---|
| A | U+0041 | LATIN CAPITAL LETTER A | 41 | 65 | Yes |
| a | U+0061 | LATIN SMALL LETTER A | 61 | 97 | Yes |
| 0 | U+0030 | DIGIT ZERO | 30 | 48 | Yes |
| $ | U+0024 | DOLLAR SIGN | 24 | 36 | Yes |
| £ | U+00A3 | POUND SIGN | 81 92 | 129 146 | Yes |
| © | U+00A9 | COPYRIGHT SIGN | not supported | — | No |
| € | U+20AC | EURO SIGN | not supported | — | No |
| α | U+03B1 | GREEK SMALL LETTER ALPHA | 83 BF | 131 191 | Yes |
| А | U+0410 | CYRILLIC CAPITAL LETTER A | 84 40 | 132 64 | Yes |
| 中 | U+4E2D | CJK UNIFIED IDEOGRAPH-4E2D | 92 86 | 146 134 | Yes |
| あ | U+3042 | HIRAGANA LETTER A | 82 A0 | 130 160 | Yes |
| ☺ | U+263A | WHITE SMILING FACE | not supported | — | No |
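Rows like these can be reproduced for any character. The following Python sketch uses the standard `shift_jis` codec and the `unicodedata` module; the helper name `shift_jis_row` is hypothetical. Results reflect Python's mapping tables, and other Shift-JIS implementations (for example Windows code page 932) may map a few symbols differently.

```python
import unicodedata
from typing import Optional, Tuple


def shift_jis_row(char: str) -> Tuple[str, str, Optional[str]]:
    """Return (codepoint, name, hex bytes) for one character.

    The hex field is None when the character cannot be encoded.
    """
    codepoint = f"U+{ord(char):04X}"
    name = unicodedata.name(char, "UNKNOWN")
    try:
        hex_bytes = char.encode("shift_jis").hex(" ").upper()
    except UnicodeEncodeError:
        hex_bytes = None  # "not supported" in the table
    return codepoint, name, hex_bytes


for ch in "A€あ中":
    print(ch, *shift_jis_row(ch))
```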
Working with Shift-JIS in Code
Most major languages can convert text to and from Shift-JIS, either natively or through a standard extension. The examples below, in Python, PHP, JavaScript, and PostgreSQL, show how to work with Shift-JIS bytes and convert them back to Unicode text. Always specify the encoding explicitly; never rely on system defaults, which vary by OS and locale.
# Encode a string to shift-jis bytes
text = "Hello, 世界"
encoded = text.encode("Shift_JIS")
# Decode bytes back to a string
decoded = encoded.decode("Shift_JIS")
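Encoding fails for characters that Shift-JIS cannot represent, so real code needs a policy for them. The sketch below shows the standard `errors` handlers of Python's `str.encode`; the sample string is made up for illustration, and which handler is appropriate depends on what will consume the bytes.

```python
text = "5€ ☺"  # the euro sign and the smiley are not in Shift-JIS

# Default (strict) behaviour: raise on the first unencodable character.
try:
    text.encode("shift_jis")
except UnicodeEncodeError as exc:
    print(f"cannot encode {exc.object[exc.start:exc.end]!r}")

# Lossy: substitute a question mark for each unencodable character.
replaced = text.encode("shift_jis", errors="replace")

# Reversible: keep the code point as an XML character reference
# or as a Python-style backslash escape.
xml_refs = text.encode("shift_jis", errors="xmlcharrefreplace")
escaped = text.encode("shift_jis", errors="backslashreplace")

print(replaced, xml_refs, escaped)
```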
// Convert to shift-jis
$bytes = mb_convert_encoding(
"Hello, 世界",
"Shift_JIS",
"UTF-8"
);
// Convert back to UTF-8
$text = mb_convert_encoding(
$bytes,
"UTF-8",
"Shift_JIS"
);
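// TextEncoder only produces UTF-8, so encoding to Shift_JIS needs a
// library; start here from bytes that are already Shift_JIS-encoded
const bytes = new Uint8Array([0x82, 0xa0]); // "あ" in Shift_JIS
// Decode Shift_JIS bytes to a string
const decoder = new TextDecoder("Shift_JIS");
const text = decoder.decode(bytes); // "あ"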
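-- PostgreSQL supports Shift-JIS (SJIS) only as a client encoding, not as
-- a database encoding, so store the data as UTF-8
CREATE DATABASE mydb
ENCODING 'UTF8'
LC_COLLATE 'en_US.UTF-8'
TEMPLATE template0;
-- Let the server convert to and from Shift-JIS for this session
SET client_encoding TO 'SJIS';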
-- Check database encoding
SELECT pg_encoding_to_char(encoding)
FROM pg_database
WHERE datname = current_database();
Shift-JIS FAQ
What is the difference between Shift-JIS and EUC-JP?
Both encode Japanese (hiragana, katakana, kanji) but use different byte structures. Shift-JIS uses lead bytes 0x81–0x9F and 0xE0–0xFC for double-byte characters, with single-byte half-width katakana in 0xA1–0xDF. EUC-JP uses bytes in 0xA1–0xFE for both halves of a two-byte character and 0x8E as a prefix for half-width katakana. Shift-JIS dominated Windows; EUC-JP dominated Unix systems.
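The difference is easy to see by encoding the same character both ways. This is a short Python sketch using the standard `shift_jis` and `euc_jp` codecs; the helper name `compare` is hypothetical.

```python
def compare(char: str) -> dict:
    """Hex bytes of one character in Shift-JIS and in EUC-JP."""
    return {
        enc: char.encode(enc).hex(" ").upper()
        for enc in ("shift_jis", "euc_jp")
    }


print(compare("あ"))  # hiragana: two bytes in both encodings
print(compare("ｱ"))   # half-width katakana: one byte vs. 0x8E prefix
```

The hiragana letter takes two bytes in both encodings but with different values, while the half-width katakana letter is a single byte in Shift-JIS and a 0x8E-prefixed pair in EUC-JP.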
Is Shift-JIS still in use?
Shift-JIS is still found in Japanese game ROMs, legacy Windows software, older Japanese websites, and some industrial and point-of-sale systems. Modern Japanese applications use UTF-8. When processing legacy Japanese data, always specify or detect the encoding explicitly — Shift-JIS and UTF-8 byte sequences can be ambiguous without context.
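One simple detection heuristic is to try strict UTF-8 first and fall back to Shift-JIS, because text that is not UTF-8 rarely passes UTF-8 validation by accident. The Python sketch below illustrates that idea; `decode_legacy` is a hypothetical helper, and the approach is a heuristic under the assumption that the input is one of these two encodings, not a general detector.

```python
from typing import Tuple


def decode_legacy(data: bytes) -> Tuple[str, str]:
    """Return (encoding, text), trying UTF-8 before Shift-JIS."""
    for encoding in ("utf-8", "shift_jis"):
        try:
            return encoding, data.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("data is neither valid UTF-8 nor valid Shift-JIS")


print(decode_legacy(b"\x82\xa0"))            # not valid UTF-8, so Shift-JIS
print(decode_legacy("あ".encode("utf-8")))   # valid UTF-8

# The order matters: the UTF-8 bytes of "é" also decode as Shift-JIS
# without any error, yielding two half-width katakana instead.
print(b"\xc3\xa9".decode("shift_jis"))
```

Pure ASCII input is valid in both encodings, so the label returned for it carries no information; where metadata such as an HTTP header or file format specification names the encoding, that should take precedence over guessing.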