Base32 Characters, Alphabet, Table, and Padding Explained

In this article we will learn in detail about the Base32 characters, or alphabet, in table form. We will show the differences between the characters of each Base32 variation and even explain the meaning and function of padding.

What is Base32?

Base32 is a binary-to-text encoding technique that represents binary data in a readable, printable format. It uses a set of 32 characters to accomplish this. Raw binary data, by contrast, can include unusual or non-printable characters that cause problems during transmission or storage.

Base32 Breakdown:

  • Binary to Text: Takes binary data (any sequence of bytes) and turns it into text made of letters (A-Z) and numbers (2-7). The text is longer than the input, not shorter: every 5 bytes of data become 8 characters.
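  • URL Friendly: The alphabet avoids symbols that cause problems in web addresses, making it useful for embedding data in URLs. The “=” padding character is the one exception and is often dropped or percent-encoded in that setting.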
  • Not Encryption: Anyone can decode the Base32 text back to the original data. Think of it as a different language, not a secret message.
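The points above can be illustrated with a short round trip. This is a minimal sketch that assumes Python's standard `base64` module as the encoder; the sample input `b"hello"` is an arbitrary choice.

```python
import base64

data = b"hello"                      # 5 bytes of input
encoded = base64.b32encode(data)     # standard Base32 alphabet (A-Z, 2-7)
decoded = base64.b32decode(encoded)  # no key or secret is needed to decode

print(encoded)                   # b'NBSWY3DP'
print(len(data), len(encoded))   # 5 8
assert decoded == data           # the round trip restores the original bytes
```

The output is 8 characters for 5 input bytes, which shows that Base32 trades size for a safe, printable alphabet.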

Base32 Character Set

The character set is the key to Base32. This encoding technique uses exactly 32 characters: the uppercase letters A through Z and the digits 2–7. The digits 0 and 1 are left out because they are easily confused with the letters O and I, which reduces ambiguity when encoded data is read or typed by hand.

The characters fall into two groups according to their index within the character set:

Uppercase Letters (Indices 0-25): A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z

Numbers (Indices 26-31): 2, 3, 4, 5, 6, 7
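The two index ranges can be written down directly in code. The sketch below is only illustrative; the names `ALPHABET` and `INDEX_OF` are hypothetical and not part of any library.

```python
import string

# Indices 0-25 are the letters A-Z, indices 26-31 are the digits 2-7.
ALPHABET = string.ascii_uppercase + "234567"

# Reverse lookup: character -> index, as a decoder would need.
INDEX_OF = {ch: i for i, ch in enumerate(ALPHABET)}

assert len(ALPHABET) == 32
print(ALPHABET[0], ALPHABET[25], ALPHABET[26], ALPHABET[31])  # A Z 2 7
print(INDEX_OF["K"], INDEX_OF["5"])                            # 10 29
```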

Base32 Table

Below you can see a table of Base32 characters. This table shows the mapping between the original binary values and their corresponding Base32 characters. Each group of 5 binary bits (representing a tiny chunk of data) gets converted into a single Base32 character using this table.

Binary   Decimal   Base32
00000    0         A
00001    1         B
00010    2         C
00011    3         D
00100    4         E
00101    5         F
00110    6         G
00111    7         H
01000    8         I
01001    9         J
01010    10        K
01011    11        L
01100    12        M
01101    13        N
01110    14        O
01111    15        P
10000    16        Q
10001    17        R
10010    18        S
10011    19        T
10100    20        U
10101    21        V
10110    22        W
10111    23        X
11000    24        Y
11001    25        Z
11010    26        2
11011    27        3
11100    28        4
11101    29        5
11110    30        6
11111    31        7
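To see the table in action, the following sketch encodes one full 5-byte block by cutting its 40 bits into eight 5-bit groups and looking each group up in the alphabet. The function name `encode_block` is hypothetical, and the result is compared against Python's standard `base64` module as a sanity check.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

def encode_block(block: bytes) -> str:
    """Encode exactly 5 bytes (40 bits) as 8 Base32 characters."""
    if len(block) != 5:
        raise ValueError("this sketch only handles full 5-byte blocks")
    value = int.from_bytes(block, "big")   # the 40 bits as one integer
    chars = []
    for shift in range(35, -1, -5):        # 35, 30, ..., 0: most significant group first
        index = (value >> shift) & 0b11111 # extract one 5-bit group (0-31)
        chars.append(ALPHABET[index])      # table lookup
    return "".join(chars)

print(encode_block(b"hello"))  # NBSWY3DP
assert encode_block(b"hello") == base64.b32encode(b"hello").decode("ascii")
```

For `b"hello"` the first 5-bit group is 01101 (decimal 13), which the table maps to N.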

Base32 Padding

Padding in the context of Base32 refers to the addition of extra characters, usually “=” (equals signs), at the end of the encoded data to ensure that its length is a multiple of 8 characters. Each Base32 character represents 5 bits, while input data comes in 8-bit bytes, so the encoder works on blocks of 40 bits (5 bytes in, 8 characters out). When the input is not an exact multiple of 5 bytes in length, the final block is incomplete. Its leftover bits are filled with zeros to complete the last character, and padding characters fill the remaining positions of the 8-character group.

Let’s break down how padding works in Base32 encoding:

  • If the last 5-byte block contains 1 byte of input data, add 4 zero bytes. Then, encode it as a regular block and replace the last 6 characters with 6 equal signs (======).
  • If the last 5-byte block holds 2 bytes of input data, add 3 zero bytes. After encoding it as a standard block, change the last 4 characters to 4 equal signs (====).
  • If the last 5-byte block includes 3 bytes of input data, add 2 zero bytes. Encode it as a regular block and replace the last 3 characters with 3 equal signs (===).
  • If the last 5-byte block has 4 bytes of input data, add 1 zero byte. After encoding it as a normal block, replace the last 1 character with 1 equal sign (=).
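The four cases above follow one rule: a final block of r bytes needs ceil(8r / 5) data characters, and the rest of the 8-character group is padding. The sketch below, with the hypothetical helper `padding_length`, checks that rule against Python's standard `base64` module.

```python
import base64

def padding_length(num_bytes: int) -> int:
    """Number of '=' characters produced when encoding num_bytes bytes."""
    remainder = num_bytes % 5            # bytes in the final, partial block
    data_chars = -(-8 * remainder // 5)  # ceil(8 * remainder / 5)
    return (8 - data_chars) % 8          # 0 when the last block is complete

for n in range(1, 6):
    encoded = base64.b32encode(b"x" * n).decode("ascii")
    assert encoded.count("=") == padding_length(n)
    print(n, encoded, padding_length(n))
```

For input lengths of 1 to 5 bytes the helper returns 6, 4, 3, 1 and 0, matching the list above.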

This is why padding is necessary:

  1. Data Length Consistency: Without padding, the length of Base32-encoded data varies with the input and is not always a multiple of 8 characters. Padding ensures that the encoded data always consists of complete 8-character groups (40 bits each), so a decoder can process it block by block and tell from the number of padding characters how many bytes the final block holds.
  2. Alignment: Padding aligns the encoded data to fixed boundaries, making it easier to work with and ensuring that the data can be decoded correctly.
  3. Error Detection: Padding characters give Base32 a basic length check. If the encoded data does not end with the expected number of padding characters, it may have been truncated or corrupted.
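As a small illustration of the length check, Python's standard decoder rejects input whose length is not a multiple of 8 characters. The helper `b32decode_unpadded` below is a hypothetical convenience, not a library function; it restores stripped padding before decoding.

```python
import base64
import binascii

def b32decode_unpadded(text: str) -> bytes:
    """Decode Base32 text whose trailing '=' characters were stripped."""
    missing = -len(text) % 8               # characters needed to reach a multiple of 8
    return base64.b32decode(text + "=" * missing)

padded = base64.b32encode(b"hi").decode("ascii")  # 'NBUQ===='
stripped = padded.rstrip("=")                      # 'NBUQ'

try:
    base64.b32decode(stripped)             # length 4 is not a multiple of 8
except binascii.Error as exc:
    print("rejected:", exc)

print(b32decode_unpadded(stripped))        # b'hi'
```

Because the missing padding can be recomputed from the length alone, some systems omit it in transit and restore it in this way before decoding.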

Differences Between Base32, Z-Base32, Base32Hex and Crockford’s Base32 Character Sets

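The table below shows the differences between the character sets of the Base32 variants: standard Base32, Z-Base32, Base32Hex and Crockford’s Base32. Each row lists the character that the four alphabets use for the same 5-bit value, starting at 0 in the first row and ending at 31 in the last.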

Base32   Z-Base32   Base32Hex   Crockford’s Base32
A        y          0           0
B        b          1           1
C        n          2           2
D        d          3           3
E        r          4           4
F        f          5           5
G        g          6           6
H        8          7           7
I        e          8           8
J        j          9           9
K        k          A           A
L        m          B           B
M        c          C           C
N        p          D           D
O        q          E           E
P        x          F           F
Q        o          G           G
R        t          H           H
S        1          I           J
T        u          J           K
U        w          K           M
V        i          L           N
W        s          M           P
X        z          N           Q
Y        a          O           R
Z        3          P           S
2        4          Q           T
3        5          R           V
4        h          S           W
5        7          T           X
6        6          U           Y
7        9          V           Z
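Since the variants in the table differ only in which character stands for each 5-bit value, one way to sketch them is to encode with the standard alphabet and then substitute characters position by position. The function `encode_variant` and the constant names below are hypothetical. Note also that Z-Base32 and Crockford’s Base32 are commonly used without “=” padding, which the `pad` flag models.

```python
import base64

# The four alphabets, each read top to bottom from the table (values 0-31).
STANDARD  = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
ZBASE32   = "ybndrfg8ejkmcpqxot1uwisza345h769"
BASE32HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def encode_variant(data: bytes, alphabet: str, pad: bool = True) -> str:
    """Encode with the standard alphabet, then map each character to the variant."""
    text = base64.b32encode(data).decode("ascii")
    if not pad:
        text = text.rstrip("=")
    # '=' is not in the translation table, so padding passes through unchanged.
    return text.translate(str.maketrans(STANDARD, alphabet))

print(encode_variant(b"hello", STANDARD))    # NBSWY3DP
print(encode_variant(b"hello", ZBASE32))     # pb1sa5dx
print(encode_variant(b"hello", BASE32HEX))   # D1IMOR3F
print(encode_variant(b"hello", CROCKFORD))   # D1JPRV3F
```

The same 5-bit values produce four different strings, so a decoder has to know which variant was used to encode the data.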