Converting uint8_t to binary

Question

In class, we are learning how to create ppm6 files. We have a 2D array of uint8_t, and the class is asking us to use fwrite() to somehow convert this 2D array into a bunch of binary characters that look like this:

ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿÿ\0ÿ

How does one go about doing this?

Someone in class is saying that fwrite() will do the conversion automatically, which doesn't make sense to me at all. This is what I'm currently writing. color_array is a 2D array of uint8_t, and each color_array[i] is an array of three uint8_t values.

fwrite(color_array[i], sizeof(uint8_t), 3, output);

And my output is:

???????????????????

How do you read the output? With which program? It would be useful to look at the hex codes of the real data in there. — virolino, Oct 01 '19 at 06:09
The presentation of binary data in general depends on the software that generates the output. If you use a hex file viewer/editor it will be shown as hex values. If you use a standard editor (think "Notepad++") it will be interpreted in the coding you selected; just some examples: ANSI, UTF-8. If you use a shell or command interpreter it will be interpreted in the coding, too. From where comes the example output you show us? — the busybee, Oct 01 '19 at 06:12
Binary data when viewed as text is going to look like total trash most of the time. Use a hex-dump tool if you want to see what it actually contains. — tadman, Oct 01 '19 at 06:34

Marco Bonelli · Answer 1 · 2019-10-01T06:56:02.913

First of all, some basics.

Writing data "in binary form" doesn't mean anything special. The program just takes whatever it has in RAM and dumps it to a file. This means that something like this:

uint8_t x = 10;
fwrite(&x, sizeof(uint8_t), 1, some_file);

will result in the binary value of 10 being written directly into the opened some_file. Since the value 10 is stored in a uint8_t and sizeof(uint8_t) is 1 (one byte), that call writes exactly one byte. The binary representation (as one byte) of 10 is 00001010.
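If you want to convince yourself of this, here is a minimal sketch that writes a single byte and reads it straight back. The function name write_and_read_back and its path parameter are made up for illustration; they are not part of the code in the question.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes one byte to `path` with fwrite(), then reads it back with fread().
 * Returns the byte that was read (0..255), or -1 on any I/O error.
 * Hypothetical helper, for illustration only. */
int write_and_read_back(const char *path, uint8_t value)
{
    FILE *f = fopen(path, "wb");   /* "b": no newline translation */
    if (f == NULL)
        return -1;
    size_t written = fwrite(&value, sizeof value, 1, f);
    if (fclose(f) != 0 || written != 1)
        return -1;

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    uint8_t back = 0;
    size_t got = fread(&back, sizeof back, 1, f);
    fclose(f);
    return got == 1 ? (int)back : -1;
}
```

Calling write_and_read_back("x.bin", 10) should return 10 and leave behind a file that is exactly one byte long. No conversion step is involved: the byte in memory and the byte in the file are the same.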

If, after writing a binary file, you try to open it using a text editor, or you try to print it to the terminal using something like cat, it will be interpreted as text based on the encoding configured in that editor or terminal. This will most likely show up as strange, seemingly random characters, because the data was never meant to represent text, only one-byte integer values. Only whoever created the file knows how it should be read; the bytes themselves can be interpreted in any number of ways.

With that said, what you see there:

ÿ\0\0ÿ\0\0ÿ\0\0ÿ\0\0ÿ

is exactly the effect described above. To see the real values, pass that output to a hex viewer, such as the hd command:

$ hd your_file
00000000  c3 bf 00 00 c3 bf 00 00  c3 bf 00 00 c3 bf 00 00  |................|                                                                        
00000010  c3 bf                                             |..|                                                                                     
00000012

If that text was UTF-8 (most likely), the bytes behind it are the ones shown above: each ÿ is encoded as the two bytes c3 bf, and each \0 is a single zero byte. Converted from hexadecimal to decimal, they are:

195 191 0 0 195 191 0 0 195 191 0 0 195 191 0 0 195 191
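If no hex viewer is at hand, you can do the same kind of inspection from C before the data ever reaches a file. The sketch below formats a buffer as space-separated hex values; bytes_to_hex is a hypothetical helper written for this answer, not a standard library function.

```c
#include <stdint.h>
#include <stdio.h>

/* Formats `n` bytes as space-separated two-digit hex values into `out`.
 * Returns the number of characters written (not counting the terminator),
 * or -1 if `out` is too small. Hypothetical helper, for illustration only. */
int bytes_to_hex(const uint8_t *data, size_t n, char *out, size_t out_size)
{
    size_t pos = 0;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (size_t i = 0; i < n; i++) {
        /* Worst case per byte: separator + two hex digits + terminator. */
        if (out_size - pos < 4)
            return -1;
        if (i > 0)
            out[pos++] = ' ';
        pos += (size_t)snprintf(out + pos, out_size - pos, "%02x",
                                (unsigned)data[i]);
    }
    return (int)pos;
}
```

For a red pixel stored as {255, 0, 0}, this produces the string ff 00 00, which is what a hex viewer would show for those three bytes.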

Now, talking about the actual code you posted:

What you are doing is right, assuming that your 2D array is Nx3. To write binary data to a file, fwrite() is the right function.

So, this:

fwrite(color_array[i], sizeof(uint8_t), 3, output);

writes the i-th row of the 2D array, which consists of 3 uint8_t values, to the file in binary form.

As an example, consider the following:

uint8_t color_array[2][3] = {{1, 2, 3}, {10, 11, 12}};

fwrite(color_array[0], sizeof(uint8_t), 3, output);
fwrite(color_array[1], sizeof(uint8_t), 3, output);

This produces a file whose contents look like this:

$ hd your_file
00000000  01 02 03 0a 0b 0c                                 |......|                                                                                  
00000006
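Putting this together for an image: assuming the class means the binary PPM format (magic number P6) with a maximum color value of 255, a complete file is a short text header followed by the raw RGB bytes. The sketch below is one way to write it; write_ppm_p6 and its parameters are hypothetical names, and the pixel data is assumed to be stored contiguously as width * height RGB triples, as in a uint8_t array declared with [N][3].

```c
#include <stdint.h>
#include <stdio.h>

/* Writes a binary PPM ("P6") image with a maximum color value of 255.
 * `pixels` must point to width * height contiguous RGB triples,
 * e.g. a uint8_t color_array[width * height][3].
 * Returns 0 on success, -1 on any I/O error.
 * Hypothetical helper, for illustration only. */
int write_ppm_p6(const char *path, size_t width, size_t height,
                 const void *pixels)
{
    FILE *out = fopen(path, "wb");   /* binary mode: bytes go out untouched */
    if (out == NULL)
        return -1;

    /* Text header: magic number, dimensions, maximum color value. */
    int ok = fprintf(out, "P6\n%zu %zu\n255\n", width, height) > 0;

    /* Raw pixel data: one fwrite() for all RGB triples. */
    size_t count = width * height;
    if (ok && fwrite(pixels, 3 * sizeof(uint8_t), count, out) != count)
        ok = 0;

    if (fclose(out) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Because all rows of such an array sit next to each other in memory, a single fwrite() call can replace the loop over color_array[i]. Checking the return value of fwrite() is worth the extra line, since it is the only way to notice a short write.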

@YesThatIsMyName well, they are not. `hd` and `hexdump` are different programs. They are both hexadecimal viewers, if that's what you meant. — Marco Bonelli, Oct 01 '19 at 06:55
The manpage says: HEXDUMP(1) BSD General Commands Manual HEXDUMP(1) NAME hexdump, hd — ASCII, decimal, hexadecimal, octal dump — YesThatIsMyName, Oct 01 '19 at 06:56
@YesThatIsMyName that does not mean much, it's just a manual page for both commands. They take similar options, but are two different programs. Just as an example, `hd` (with no arguments) defaults to displaying *one*-byte hex values along with their ASCII representation, while `hexdump` displays *two*-byte hex values without their ASCII representation. — Marco Bonelli, Oct 01 '19 at 07:03
On my Ubuntu hd is just a link to hexdump, and hexdump the binary. — YesThatIsMyName, Oct 01 '19 at 07:10
@YesThatIsMyName that's a very common design pattern. When the program is invoked from the command line, it checks if `argv[0]` is equal to `"hd"` or to `"hexdump"`, and behaves differently according to which one of the two names was used to invoke it. It's basically two different programs embedded into one. — Marco Bonelli, Oct 01 '19 at 07:13
@YesThatIsMyName I don't think so. It really adds no value to the explanation and just creates confusion for an inexperienced user. This is not a question regarding hex viewers, I used `hd` only to illustrate some examples. — Marco Bonelli, Oct 01 '19 at 07:17
