UTF-8 vs UTF-32 LE

UTF-8

Unicode

The dominant encoding on the web. Variable-width: each codepoint takes 1 to 4 bytes. Fully backward-compatible with ASCII, so any ASCII text is already valid UTF-8 with the same bytes. The default encoding for HTML5, JSON, and most modern protocols.

Name: UTF-8
Width: 1–4 bytes
BOM: EF BB BF
Year: 1993
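
As a small illustration of the variable width, the Python sketch below encodes one character from each length class and prints the UTF-8 byte order mark. The sample characters and the helper name `utf8_width` are choices made for this example only; `codecs.BOM_UTF8` comes from Python's standard library.

```python
import codecs

def utf8_width(ch: str) -> int:
    """Return the number of bytes UTF-8 uses for a single character."""
    return len(ch.encode("utf-8"))

# One sample per width class (samples chosen for illustration).
samples = ["A", "é", "€", "😀"]  # U+0041, U+00E9, U+20AC, U+1F600
widths = [utf8_width(ch) for ch in samples]
print(widths)  # [1, 2, 3, 4]

# ASCII text is byte-for-byte identical in UTF-8.
assert "Hello".encode("utf-8") == "Hello".encode("ascii")

# The UTF-8 byte order mark is the three bytes EF BB BF.
print(codecs.BOM_UTF8.hex(" ").upper())  # EF BB BF
```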

UTF-32 LE

Unicode

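Fixed-width encoding using 4 bytes per codepoint. Simple to process, because the Nth codepoint always starts at byte offset 4N, but memory-inefficient: ASCII text takes four times the space it does in UTF-8. Little-endian byte order, meaning the least significant byte is stored first.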

Name: UTF-32LE
Width: 4 bytes
BOM: FF FE 00 00
Year: 2003
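
The fixed width and little-endian layout can be seen with a short Python sketch. The helper name `utf32le_bytes` and the sample characters are assumptions made for illustration; the `utf-32-le` codec and `codecs.BOM_UTF32_LE` are part of Python's standard library.

```python
import codecs

def utf32le_bytes(ch: str) -> str:
    """Return the UTF-32 LE bytes of one character as spaced hex."""
    return ch.encode("utf-32-le").hex(" ").upper()

# The least significant byte of the codepoint comes first.
print(utf32le_bytes("A"))   # 41 00 00 00
print(utf32le_bytes("€"))   # AC 20 00 00
print(utf32le_bytes("😀"))  # 00 F6 01 00

# Every character takes exactly 4 bytes.
text = "Hello"
assert len(text.encode("utf-32-le")) == 4 * len(text)

# Python's explicit "utf-32-le" codec writes no BOM; the BOM constant is FF FE 00 00.
print(codecs.BOM_UTF32_LE.hex(" ").upper())  # FF FE 00 00
```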

Both UTF-8 and UTF-32 LE are Unicode encodings: they represent the same 138,571 characters. The difference is in how each codepoint is serialized to bytes. The table below shows how the first 128 codepoints (the ASCII range) are encoded in each.

ASCII Range (U+0000–U+007F)

Char       Codepoint  UTF-8  UTF-32 LE    Same?
(control)  U+0000     00     00 00 00 00  No
(control)  U+0001     01     01 00 00 00  No
(control)  U+0002     02     02 00 00 00  No
(control)  U+0003     03     03 00 00 00  No
(control)  U+0004     04     04 00 00 00  No
(control)  U+0005     05     05 00 00 00  No
(control)  U+0006     06     06 00 00 00  No
(control)  U+0007     07     07 00 00 00  No
(control)  U+0008     08     08 00 00 00  No
(control)  U+0009     09     09 00 00 00  No
(control)  U+000A     0A     0A 00 00 00  No
(control)  U+000B     0B     0B 00 00 00  No
(control)  U+000C     0C     0C 00 00 00  No
(control)  U+000D     0D     0D 00 00 00  No
(control)  U+000E     0E     0E 00 00 00  No
(control)  U+000F     0F     0F 00 00 00  No
(control)  U+0010     10     10 00 00 00  No
(control)  U+0011     11     11 00 00 00  No
(control)  U+0012     12     12 00 00 00  No
(control)  U+0013     13     13 00 00 00  No
(control)  U+0014     14     14 00 00 00  No
(control)  U+0015     15     15 00 00 00  No
(control)  U+0016     16     16 00 00 00  No
(control)  U+0017     17     17 00 00 00  No
(control)  U+0018     18     18 00 00 00  No
(control)  U+0019     19     19 00 00 00  No
(control)  U+001A     1A     1A 00 00 00  No
(control)  U+001B     1B     1B 00 00 00  No
(control)  U+001C     1C     1C 00 00 00  No
(control)  U+001D     1D     1D 00 00 00  No
(control)  U+001E     1E     1E 00 00 00  No
(control)  U+001F     1F     1F 00 00 00  No
(space)    U+0020     20     20 00 00 00  No
!          U+0021     21     21 00 00 00  No
"          U+0022     22     22 00 00 00  No
#          U+0023     23     23 00 00 00  No
$          U+0024     24     24 00 00 00  No
%          U+0025     25     25 00 00 00  No
&          U+0026     26     26 00 00 00  No
'          U+0027     27     27 00 00 00  No
(          U+0028     28     28 00 00 00  No
)          U+0029     29     29 00 00 00  No
*          U+002A     2A     2A 00 00 00  No
+          U+002B     2B     2B 00 00 00  No
,          U+002C     2C     2C 00 00 00  No
-          U+002D     2D     2D 00 00 00  No
.          U+002E     2E     2E 00 00 00  No
/          U+002F     2F     2F 00 00 00  No
0          U+0030     30     30 00 00 00  No
1          U+0031     31     31 00 00 00  No
2          U+0032     32     32 00 00 00  No
3          U+0033     33     33 00 00 00  No
4          U+0034     34     34 00 00 00  No
5          U+0035     35     35 00 00 00  No
6          U+0036     36     36 00 00 00  No
7          U+0037     37     37 00 00 00  No
8          U+0038     38     38 00 00 00  No
9          U+0039     39     39 00 00 00  No
:          U+003A     3A     3A 00 00 00  No
;          U+003B     3B     3B 00 00 00  No
<          U+003C     3C     3C 00 00 00  No
=          U+003D     3D     3D 00 00 00  No
>          U+003E     3E     3E 00 00 00  No
?          U+003F     3F     3F 00 00 00  No
@          U+0040     40     40 00 00 00  No
A          U+0041     41     41 00 00 00  No
B          U+0042     42     42 00 00 00  No
C          U+0043     43     43 00 00 00  No
D          U+0044     44     44 00 00 00  No
E          U+0045     45     45 00 00 00  No
F          U+0046     46     46 00 00 00  No
G          U+0047     47     47 00 00 00  No
H          U+0048     48     48 00 00 00  No
I          U+0049     49     49 00 00 00  No
J          U+004A     4A     4A 00 00 00  No
K          U+004B     4B     4B 00 00 00  No
L          U+004C     4C     4C 00 00 00  No
M          U+004D     4D     4D 00 00 00  No
N          U+004E     4E     4E 00 00 00  No
O          U+004F     4F     4F 00 00 00  No
P          U+0050     50     50 00 00 00  No
Q          U+0051     51     51 00 00 00  No
R          U+0052     52     52 00 00 00  No
S          U+0053     53     53 00 00 00  No
T          U+0054     54     54 00 00 00  No
U          U+0055     55     55 00 00 00  No
V          U+0056     56     56 00 00 00  No
W          U+0057     57     57 00 00 00  No
X          U+0058     58     58 00 00 00  No
Y          U+0059     59     59 00 00 00  No
Z          U+005A     5A     5A 00 00 00  No
[          U+005B     5B     5B 00 00 00  No
\          U+005C     5C     5C 00 00 00  No
]          U+005D     5D     5D 00 00 00  No
^          U+005E     5E     5E 00 00 00  No
_          U+005F     5F     5F 00 00 00  No
`          U+0060     60     60 00 00 00  No
a          U+0061     61     61 00 00 00  No
b          U+0062     62     62 00 00 00  No
c          U+0063     63     63 00 00 00  No
d          U+0064     64     64 00 00 00  No
e          U+0065     65     65 00 00 00  No
f          U+0066     66     66 00 00 00  No
g          U+0067     67     67 00 00 00  No
h          U+0068     68     68 00 00 00  No
i          U+0069     69     69 00 00 00  No
j          U+006A     6A     6A 00 00 00  No
k          U+006B     6B     6B 00 00 00  No
l          U+006C     6C     6C 00 00 00  No
m          U+006D     6D     6D 00 00 00  No
n          U+006E     6E     6E 00 00 00  No
o          U+006F     6F     6F 00 00 00  No
p          U+0070     70     70 00 00 00  No
q          U+0071     71     71 00 00 00  No
r          U+0072     72     72 00 00 00  No
s          U+0073     73     73 00 00 00  No
t          U+0074     74     74 00 00 00  No
u          U+0075     75     75 00 00 00  No
v          U+0076     76     76 00 00 00  No
w          U+0077     77     77 00 00 00  No
x          U+0078     78     78 00 00 00  No
y          U+0079     79     79 00 00 00  No
z          U+007A     7A     7A 00 00 00  No
{          U+007B     7B     7B 00 00 00  No
|          U+007C     7C     7C 00 00 00  No
}          U+007D     7D     7D 00 00 00  No
~          U+007E     7E     7E 00 00 00  No
(control)  U+007F     7F     7F 00 00 00  No
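
The rows above can be regenerated and checked with a few lines of Python. This is a minimal sketch, not the method used to build the original table; the function name `table_row` is hypothetical. It also confirms the pattern visible in the table: in the ASCII range, the UTF-32 LE form is the single UTF-8 byte followed by three zero bytes.

```python
def table_row(codepoint: int) -> tuple:
    """Return (codepoint label, UTF-8 hex, UTF-32 LE hex) for one codepoint."""
    ch = chr(codepoint)
    return (
        f"U+{codepoint:04X}",
        ch.encode("utf-8").hex(" ").upper(),
        ch.encode("utf-32-le").hex(" ").upper(),
    )

# Build all 128 rows of the ASCII range.
rows = [table_row(cp) for cp in range(0x80)]
print(rows[0x41])  # ('U+0041', '41', '41 00 00 00')

# In the ASCII range, UTF-32 LE is the UTF-8 byte plus three zero bytes.
for cp in range(0x80):
    ch = chr(cp)
    assert ch.encode("utf-32-le") == ch.encode("utf-8") + b"\x00\x00\x00"
```

The byte sequences are never identical, since the lengths differ, but the first byte always matches, which is why ASCII-heavy data in UTF-32 LE looks like ASCII with zero padding in a hex dump.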