How to Encode and Decode Base32 in the C Programming Language

Whether you’re just learning the basics of Base32 and C, or are an experienced programmer looking to expand your knowledge, this comprehensive guide is designed to meet your needs, covering Base32 encoding and decoding in the C programming environment.

What is Base32?

Base32 is a binary-to-text encoding scheme that represents binary data in an ASCII string format by translating it into a radix-32 (base-32) representation. The base-32 encoding uses a set of 32 characters, consisting of letters and digits.

The main advantage of Base32 is not compactness: it produces 8 output characters for every 5 input bytes (about 60% overhead), whereas Base64 produces 4 characters for every 3 bytes (about 33% overhead). What Base32 offers in exchange is a case-insensitive alphabet of letters and digits that is safe for use in URLs, filenames, and other situations where a limited character set is preferred. This makes Base32 a popular choice for applications that require human-readable, easily transcribable encoding of binary data.
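To make the size trade-off concrete, the padded output length of each encoding can be computed from the input length. The following is a minimal sketch; the function names base32_encoded_len and base64_encoded_len are illustrative and not part of any standard library.

```c
#include <stddef.h>

/* Padded Base32 length: every group of up to 5 input bytes
 * becomes 8 output characters. */
size_t base32_encoded_len(size_t n)
{
    return (n + 4) / 5 * 8;
}

/* Padded Base64 length: every group of up to 3 input bytes
 * becomes 4 output characters. */
size_t base64_encoded_len(size_t n)
{
    return (n + 2) / 3 * 4;
}
```

For a 15-byte input, these formulas give 24 characters for Base32 and 20 characters for Base64.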

There are several variants of Base32, including the alphabet defined in RFC 4648, Crockford’s Base32, and z-base-32. These variants use different character sets and may have additional features, such as error detection or improved readability.

What is the C Programming Language? Does it Natively Support Base32?

Dennis Ritchie created the imperative, general-purpose programming language C at Bell Laboratories in the 1970s. It is a flexible language that can be applied to many different tasks, such as web development, game creation, and system programming.

Here’s a breakdown of the C programming language:

Veteran of the Industry: Dennis Ritchie developed C in the 1970s, and it is still widely influential today.
Building Blocks of Computing: Used to develop operating systems (such as Linux), embedded systems, and fundamental software functionality.
Powerful & Efficient: Provides fine control over memory and hardware, making it perfect for applications that require performance.
Procedural Language: Focuses on step-by-step instructions, giving programmers a clear structure.
Foundation for Others: C’s fundamentals influenced numerous popular languages, including C++, Java, and Python.
Learning Curve: While powerful, C’s low-level nature can make it more difficult for newcomers to learn.

The C standard library does not provide Base32 encoding or decoding functions. Unlike some high-level languages that have built-in support for various data encoding schemes, C requires developers to implement Base32 functionality themselves or to rely on a third-party library.

Base32 Libraries for C

Before we move on to the manual implementation, note that we don’t necessarily have to write the Base32 code ourselves. We can use existing libraries.

External libraries are collections of pre-written code that can be used to expand the capabilities of a programming language. They are usually written by third-party developers and distributed for free or for a fee.

There are also several libraries available for encoding and decoding Base32 in C. Here are a few options:

chromium/base32.c: This library is part of the Chromium OS platform and includes a Base32 encoding/decoding implementation.
mjg59/tpmtotp: This library is part of the TPM-totp project and includes a Base32 encoding/decoding implementation.

Own Base32 Implementation

If the use of external libraries is not possible, or if you want to have full control over the encoding process, implementing your own Base32 encoder is a possible solution.

In this section, we will first review the basics of Base32 and then show how to start developing your own Base32 encoder and decoder in a way that translates directly to C.

Base32 Characters

To understand how Base32 works, we must first become familiar with its character set. The standard (RFC 4648) Base32 character set is composed of 32 characters:

Uppercase letters A to Z (26 characters)
Digits 2 to 7 (6 characters)

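Unlike standard Base64, whose alphabet includes ‘+’ and ‘/’, the Base32 alphabet contains only letters and digits, making encoded data suitable for use in URLs and filenames without the need for URL encoding. The one exception is the ‘=’ padding character, which may need to be escaped or omitted in those contexts.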
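In C, the character set is commonly stored as a lookup string, with a small helper for the reverse mapping. The sketch below assumes the RFC 4648 alphabet and an ASCII-compatible execution character set; the names BASE32_ALPHABET, base32_char, and base32_value are hypothetical and chosen only for illustration.

```c
/* RFC 4648 Base32 alphabet: indices 0..25 are 'A'..'Z',
 * indices 26..31 are '2'..'7'. */
static const char BASE32_ALPHABET[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

/* Hypothetical helper: map a 5-bit value (0..31) to its character. */
char base32_char(unsigned value)
{
    return BASE32_ALPHABET[value & 0x1F];
}

/* Hypothetical helper: map a character back to its 5-bit value.
 * Lowercase letters are accepted because Base32 is case-insensitive.
 * Returns -1 for characters outside the alphabet (including '=').
 * Assumes letters and digits are contiguous, as in ASCII. */
int base32_value(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a';
    if (c >= '2' && c <= '7') return c - '2' + 26;
    return -1;
}
```

The digits 0, 1, 8, and 9 are not part of this alphabet, so base32_value reports them as invalid.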

How Does Base32 Work?

Here are the steps to take when encoding Base32:

Divide the binary data into 40-bit (5-byte) chunks; each chunk produces 8 output characters. If the data’s length is not a multiple of 40 bits, fill the final chunk with zero bits up to the next multiple of 5 bits.
Split each chunk into 5-bit segments and map each segment to the character at that index in the character set.
Add ‘=’ padding characters as needed to guarantee that the output length is a multiple of eight characters.
The result is a Base32-encoded string, ready for transmission, storage, or further use.

During decoding, we perform the same steps in reverse order: strip the padding, map each character back to its 5-bit value, and regroup the resulting bits into bytes.
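The chunk-and-map step can be sketched in C for a single full group of 5 bytes. This is only an illustration under the assumption of the RFC 4648 alphabet; the function name base32_encode_group is hypothetical, and partial groups and padding are deliberately left out.

```c
#include <stdint.h>

/* Hypothetical helper: encode exactly one 5-byte (40-bit) group
 * into 8 Base32 characters. No terminating NUL is written. */
void base32_encode_group(const unsigned char in[5], char out[8])
{
    static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    uint64_t bits = 0;
    int i;

    /* Pack the 5 bytes into the low 40 bits, first byte most significant. */
    for (i = 0; i < 5; i++)
        bits = (bits << 8) | in[i];

    /* Extract eight 5-bit segments, most significant first,
     * and look each one up in the alphabet. */
    for (i = 0; i < 8; i++)
        out[i] = alphabet[(bits >> (35 - 5 * i)) & 0x1F];
}
```

For example, the five bytes of the ASCII string "hello" form the bit pattern 01101 00001 10010 10110 11000 11011 00011 01111, which maps to the eight characters NBSWY3DP.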

Language Independent Base32 Implementation (Pseudo Code)

The following Python-like pseudo code illustrates the Base32 encoding logic, which you can then translate into the C programming language.

function base32_encode(data):
    // Define the Base32 alphabet
    alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567'
 
    // Convert the input data to binary
    binary_data = ''.join(format(ord(char), '08b') for char in data)
 
    // Pad the binary data with zeros if necessary
    while len(binary_data) % 5 != 0:
        binary_data += '0'
 
    // Split the binary data into 5-bit chunks
    chunks = [binary_data[i:i+5] for i in range(0, len(binary_data), 5)]
 
    // Convert each 5-bit chunk to its corresponding Base32 character
    encoded_data = ''.join([alphabet[int(chunk, 2)] for chunk in chunks])
 
    return encoded_data

This implementation accepts a string of data as input and returns the Base32-encoded string. It works by converting the incoming data to a string of bits, zero-filling it to a multiple of 5 bits, and then dividing it into 5-bit chunks. Each 5-bit chunk is then mapped to a character using the Base32 alphabet. Note that this pseudo code does not append ‘=’ padding characters, so it returns the unpadded form of the encoding.

Note that this implementation assumes that the input data is a string of characters. If the input data is in a different format, such as a byte array, the implementation would need to be modified accordingly.
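For byte arrays, the same logic can be written in C with a small bit accumulator instead of a string of '0' and '1' characters. The following is a sketch rather than a definitive implementation: it assumes the RFC 4648 alphabet with '=' padding, the function name and signature are illustrative, and the caller is assumed to supply an output buffer of at least (len + 4) / 5 * 8 + 1 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of a padded RFC 4648 Base32 encoder for a byte buffer.
 * 'out' must hold at least (len + 4) / 5 * 8 + 1 bytes.
 * Returns the number of characters written, not counting the NUL. */
size_t base32_encode(const unsigned char *data, size_t len, char *out)
{
    static const char alphabet[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";
    uint32_t buffer = 0;  /* bit accumulator; only the low 'bits' bits matter */
    int bits = 0;         /* number of not-yet-encoded bits in the accumulator */
    size_t n = 0, i;

    for (i = 0; i < len; i++) {
        buffer = (buffer << 8) | data[i];
        bits += 8;
        /* Emit one character for every complete 5-bit segment. */
        while (bits >= 5) {
            bits -= 5;
            out[n++] = alphabet[(buffer >> bits) & 0x1F];
        }
    }

    /* Zero-fill the 1 to 4 leftover bits into a final 5-bit segment. */
    if (bits > 0)
        out[n++] = alphabet[(buffer << (5 - bits)) & 0x1F];

    /* Pad with '=' so the output length is a multiple of 8. */
    while (n % 8 != 0)
        out[n++] = '=';

    out[n] = '\0';
    return n;
}
```

With this sketch, the RFC 4648 test input "foobar" (6 bytes) encodes to MZXW6YTBOI======, and the single byte "f" encodes to MY======.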

Finally, let’s look at the decoding process:

function base32_decode(encoded_data):
    // Define the Base32 alphabet
    alphabet = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ234567'
 
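    // Convert the encoded data to binary, ignoring any trailing '=' padding
    binary_data = ''.join([format(alphabet.index(char), '05b') for char in encoded_data.rstrip('=')])
 
    // Discard leftover bits that do not form a complete byte
    // (they are zero fill added by the encoder, not data)
    binary_data = binary_data[:len(binary_data) - len(binary_data) % 8]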
 
    // Split the binary data into 8-bit chunks
    chunks = [binary_data[i:i+8] for i in range(0, len(binary_data), 8)]
 
    // Convert each 8-bit chunk to its corresponding ASCII character
    decoded_data = ''.join([chr(int(chunk, 2)) for chunk in chunks])
 
    return decoded_data
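
A C version of the decoder can follow the same outline with a bit accumulator. The sketch below makes several assumptions that are not from the original text: the RFC 4648 alphabet, an ASCII-compatible character set, a NUL-terminated input string, and a caller-supplied output buffer of at least strlen(in) * 5 / 8 bytes. It stops at the first '=' and does not check that the padding itself is well formed. The function name and signature are illustrative.

```c
#include <stddef.h>

/* Sketch of an RFC 4648 Base32 decoder.
 * 'in' is a NUL-terminated Base32 string, padded or unpadded.
 * 'out' must hold at least strlen(in) * 5 / 8 bytes.
 * Returns the number of bytes written, or -1 on an invalid character. */
long base32_decode(const char *in, unsigned char *out)
{
    unsigned int buffer = 0;  /* bit accumulator */
    int bits = 0;             /* number of pending bits in the accumulator */
    long n = 0;

    for (; *in != '\0' && *in != '='; in++) {
        char c = *in;
        int v;

        /* Map the character to its 5-bit value (case-insensitive). */
        if (c >= 'A' && c <= 'Z')      v = c - 'A';
        else if (c >= 'a' && c <= 'z') v = c - 'a';
        else if (c >= '2' && c <= '7') v = c - '2' + 26;
        else return -1;

        buffer = (buffer << 5) | (unsigned int)v;
        bits += 5;

        /* Emit a byte as soon as 8 bits are available. */
        if (bits >= 8) {
            bits -= 8;
            out[n++] = (unsigned char)((buffer >> bits) & 0xFF);
        }
    }

    /* Any leftover bits (fewer than 8) are encoder zero fill and are dropped. */
    return n;
}
```

Decoding MZXW6YTBOI====== with this sketch yields the 6 bytes of "foobar". Because the result is binary data, the function returns a length rather than relying on a terminating NUL.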