What is Base32: Encoding, Decoding, Characters and Alternatives

This blog post covers what Base32 is, which characters it uses, and how encoding and decoding work. We’ll also look at three alternatives to Base32: Z-Base32, Base32Hex, and Crockford’s Base32.

What is Base32?

Base32 is a binary-to-text encoding technique that represents binary data in a more readable and manageable format using a set of 32 characters. Raw binary data can include non-printable bytes that cause problems during transmission or storage; Base32 avoids these problems because its output consists only of printable characters.

As a result, Base32-encoded data can be transmitted or stored reliably in systems that have difficulty handling raw binary data. Base32 is used in a variety of applications, including data transfer and creating human-readable representations of binary data.

Here’s a quick breakdown of Base32:

  • Binary to Text: Converts binary data (a stream of bytes) into a text string made up of letters and digits.
  • Less Compact: Compared to Base64, Base32 produces a longer encoded string. Each character carries 5 bits instead of 6, so every 5 input bytes become 8 characters (Base64 turns every 3 bytes into 4 characters).
  • URL-Friendly: The character set avoids symbols like “+” and “/” that might cause issues in URLs, making it suitable for embedding data in web addresses. The “=” padding characters may still need to be escaped or omitted.
  • Less Common: Not as widely used as Base64, so library and tool support may be more limited.
  • Security Note: Similar to Base64, Base32 encoding doesn’t encrypt data. Anyone can decode it back to the original binary form.
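To make the size and reversibility points concrete, here is a minimal Python sketch that uses the standard library’s base64 module. The sample input is an arbitrary choice for illustration.

```python
import base64

data = b"Hello, World!"  # 13 bytes of sample input (arbitrary)

b32 = base64.b32encode(data)
b64 = base64.b64encode(data)

# Base32 emits 8 characters per 5 input bytes; Base64 emits 4 per 3.
print(len(data), len(b32), len(b64))  # prints: 13 24 20

# Encoding is not encryption: decoding needs no key or secret.
assert base64.b32decode(b32) == data
```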

Base32 Character Set

The key to Base32 lies in its character set. This encoding scheme uses a specific set of 32 characters: the uppercase letters A through Z and the digits 2 through 7. The digits 0 and 1 are left out because they are easily confused with the letters O and I, which helps ensure that the encoded data can be read and decoded accurately.

Here are the 32 characters used in Base32:

Binary  Decimal  Base32
00000   0        A
00001   1        B
00010   2        C
00011   3        D
00100   4        E
00101   5        F
00110   6        G
00111   7        H
01000   8        I
01001   9        J
01010   10       K
01011   11       L
01100   12       M
01101   13       N
01110   14       O
01111   15       P
10000   16       Q
10001   17       R
10010   18       S
10011   19       T
10100   20       U
10101   21       V
10110   22       W
10111   23       X
11000   24       Y
11001   25       Z
11010   26       2
11011   27       3
11100   28       4
11101   29       5
11110   30       6
11111   31       7
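The same table can be generated rather than typed out. The short Python sketch below builds the alphabet from the standard library’s string constants; the name ALPHABET is just an illustrative choice.

```python
import string

# Standard Base32 alphabet: A-Z followed by the digits 2-7.
ALPHABET = string.ascii_uppercase + "234567"

# Print each 5-bit value in binary and decimal next to its character.
for value, char in enumerate(ALPHABET):
    print(f"{value:05b}  {value:>2}  {char}")
```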

How to Encode and Decode Using Base32

Encoding data with Base32 is a straightforward process. You take your binary data and split it into 5-bit chunks, padding the last chunk with zero bits if it comes up short. Each 5-bit chunk is then mapped to the corresponding character from the Base32 character set, and the output is padded with equals signs to a multiple of 8 characters. This results in a text string that is safe for transmission and storage.

Decoding Base32 is the reverse process. You take the Base32-encoded text and convert it back into its original binary form, using the character set as a reference.

Encoding Example

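Let’s say you have the binary data 010000100011001100110010 (the three ASCII bytes of “B32”), and you want to encode it in Base32. Here’s how it’s done:

  1. Divide the binary data into 5-bit chunks. The last chunk has only 4 bits, so pad it on the right with a zero bit:
    • 01000, 01000, 11001, 10011, 0010 → 00100
  2. Map each 5-bit chunk to the corresponding Base32 character:
    • I, I, Z, T, E
  3. Pad the result with equals signs until its length is a multiple of 8 characters:
    • IIZTE===

So, the Base32-encoded representation of the binary data is IIZTE===.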

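If you’re wondering how those equals signs ended up at the end of the result: Base32 works on blocks of 5 bytes (40 bits), and each full block produces exactly 8 characters. When the last block is incomplete, it is handled as follows: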

  • If the last 5-byte block contains 1 byte of input data, add 4 zero bytes. Then, encode it as a regular block and replace the last 6 characters with 6 equal signs (======).
  • If the last 5-byte block holds 2 bytes of input data, add 3 zero bytes. After encoding it as a standard block, change the last 4 characters to 4 equal signs (====).
  • If the last 5-byte block includes 3 bytes of input data, add 2 zero bytes. Encode it as a regular block and replace the last 3 characters with 3 equal signs (===).
  • If the last 5-byte block has 4 bytes of input data, add 1 zero byte. After encoding it as a normal block, replace the last character with an equal sign (=).
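The chunking and padding rules above can be sketched in a few lines of Python. This is an illustrative implementation that works on a string of bits for readability, not for speed; the function name b32_encode is hypothetical, and real code would normally call base64.b32encode from the standard library instead.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

def b32_encode(data: bytes) -> str:
    # Write the input as one long string of bits.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Zero-fill the tail so the bit count is a multiple of 5.
    bits += "0" * (-len(bits) % 5)
    # Map each 5-bit chunk to its character in the alphabet.
    chars = [ALPHABET[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5)]
    # Pad with "=" so the output length is a multiple of 8.
    return "".join(chars) + "=" * (-len(chars) % 8)

print(b32_encode(b"B32"))  # prints: IIZTE===
```

Padding to a multiple of 8 characters gives the same result as the byte-based rules in the list: 1, 2, 3, or 4 leftover bytes yield 6, 4, 3, or 1 equals signs.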

Decoding Example

Now, let’s reverse the process and decode the Base32 string IIZTE=== back into binary data:

  1. Delete the equals signs: IIZTE
  2. Map each Base32 character to its 5-bit binary equivalent:
    • I: 01000
    • I: 01000
    • Z: 11001
    • T: 10011
    • E: 00100
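  3. Concatenate the binary values, then drop the trailing bit that does not fill a whole byte (it is the zero bit that was added during encoding):
    • 0100001000110011001100100 → 010000100011001100110010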

And there you have it – the binary data has been successfully decoded.
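The decoding steps can be sketched the same way. The function name b32_decode is hypothetical, and the sketch assumes well-formed uppercase input; it does not validate the padding or accept lowercase letters, which base64.b32decode in the standard library can handle.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"

def b32_decode(text: str) -> bytes:
    # Step 1: drop the "=" padding.
    text = text.rstrip("=")
    # Step 2: map every character back to its 5-bit value
    # (ALPHABET.index raises ValueError for an invalid character).
    bits = "".join(f"{ALPHABET.index(ch):05b}" for ch in text)
    # Step 3: keep whole bytes only; leftover bits are the encoder's zero fill.
    usable = len(bits) - len(bits) % 8
    return bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))

print(b32_decode("IIZTE==="))  # prints: b'B32'
```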

Base32 Alternatives: Z-Base32, Base32Hex, Crockford’s Base32

Base32 is a well-established method with its own character set and encoding rules. However, there are alternative encoding schemes that cater to specific needs and use cases. In this section, we will look at three noteworthy Base32 alternatives: Z-Base32, Base32Hex, and Crockford’s Base32.

Z-Base32 is an alternative encoding scheme with a few key distinctions from traditional Base32. It was designed to improve human readability while maintaining data integrity. Its character set still has 32 symbols, but it uses lowercase letters and digits, leaves out the easily confused characters 0, l, v, and 2, and arranges the alphabet so that the easier-to-read characters occur more often.

Base32Hex is another variant of Base32 encoding, defined alongside standard Base32 in RFC 4648, that differs only in its character set. It uses the digits 0-9 followed by the uppercase letters A through V. The data density is the same as standard Base32; the advantage is that encoded strings sort in the same order as the underlying binary data.

Crockford’s Base32, created by Douglas Crockford, is an encoding scheme optimized for use in URLs, filenames, and case-insensitive environments. It uses a character set consisting of ten digits and twenty-two letters. The letters ‘I’, ‘L’, and ‘O’ are excluded to avoid confusion with the digits 1 and 0, and ‘U’ is excluded to avoid accidental obscenities.

Here is a concrete comparison of the character sets of the four variations.

Base32  Z-Base32  Base32Hex  Crockford’s Base32
A       y         0          0
B       b         1          1
C       n         2          2
D       d         3          3
E       r         4          4
F       f         5          5
G       g         6          6
H       8         7          7
I       e         8          8
J       j         9          9
K       k         A          A
L       m         B          B
M       c         C          C
N       p         D          D
O       q         E          E
P       x         F          F
Q       o         G          G
R       t         H          H
S       1         I          J
T       u         J          K
U       w         K          M
V       i         L          N
W       s         M          P
X       z         N          Q
Y       a         O          R
Z       3         P          S
2       4         Q          T
3       5         R          V
4       h         S          W
5       7         T          X
6       6         U          Y
7       9         V          Z
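Because the four variants differ only in which character stands for each 5-bit value, one way to experiment with them is to encode with standard Base32 and then translate the characters. The Python sketch below does that with the alphabets copied from the table above. The helper name encode_variant is hypothetical, and dropping the “=” padding for every variant is a simplification made here; each scheme has its own padding and checksum conventions that this sketch ignores.

```python
import base64

STANDARD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
ZBASE32 = "ybndrfg8ejkmcpqxot1uwisza345h769"
BASE32HEX = "0123456789ABCDEFGHIJKLMNOPQRSTUV"
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def encode_variant(data: bytes, alphabet: str) -> str:
    # Encode with the standard alphabet, minus the "=" padding (simplification).
    standard = base64.b32encode(data).decode("ascii").rstrip("=")
    # Swap each standard character for the one at the same position
    # in the target alphabet.
    return standard.translate(str.maketrans(STANDARD, alphabet))

for name, alphabet in [("Z-Base32", ZBASE32),
                       ("Base32Hex", BASE32HEX),
                       ("Crockford", CROCKFORD)]:
    print(name, encode_variant(b"test", alphabet))
```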

Base32 Examples

Here is some text and its Base32-encoded version to give you a better idea of what Base32 is and how it changes the visual representation of the data.

Data           Base32 Encoded
Hello, World!  JBSWY3DPFQQFO33SNRSCC===
Base32         IJQXGZJTGI======
123456789      GEZDGNBVGY3TQOI=
test           ORSXG5A=
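These examples can be reproduced with Python’s standard base64 module. The sketch assumes the sample strings are encoded as UTF-8 bytes before Base32 encoding (for these ASCII-only strings, that is the same as ASCII).

```python
import base64

samples = ["Hello, World!", "Base32", "123456789", "test"]

# Turn each sample into bytes, then encode with the standard Base32 alphabet.
results = {s: base64.b32encode(s.encode("utf-8")).decode("ascii") for s in samples}

for text, encoded in results.items():
    print(f"{text} -> {encoded}")
```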