Storing individual bits in memory

Question

I want to store arbitrary bit strings of length 1 to 8 (at most one byte) in memory. I know that computers cannot store individual bits efficiently and that on most modern machines the smallest unit we can store is a byte. I have been doing some research on this but haven't come across any useful material. I need a way to store these bits so that, when they are read back from memory, 0 is NOT evaluated as 00, 0000 or 00000000. Likewise, 010 must NOT be read back, or evaluated for that matter, as 00000010. Values should be distinguished by both their value and their length in bits.

Some more examples:

1 ≠ 00000001

10 ≠ 00000010

0010 ≠ 00000010

10001 ≠ 00010001

And so on...

One thing I want to point out again is that the bit size is always between 1 and 8 (inclusive) and is NOT a fixed number. I'm using C for this problem.

What is the actual and underlying problem that you need to solve by "storing" bits this way? Perhaps you could be storing the bits as *strings* instead? — Some programmer dude, Feb 03 '20 at 12:50
I have considered that too but for the purpose of my project it seems irrelevant to the underlying problem. — lag, Feb 03 '20 at 12:52
Then you'll simply have to deal with length information somewhere, and probably use bitwise operations for computing the data itself. "compression" is just too vague a problem to give any good answer. — Nelfeal, Feb 03 '20 at 12:56
You could use two bytes: one byte to say how many bits (r-t-l) are in use, and one byte to store them. — Paul Ogilvie, Feb 03 '20 at 12:59
Yes, but the goal is to compress the data rather than expand it. — lag, Feb 03 '20 at 13:01
One possible way is to have a table containing bit positions and length, and then just one large "string" (here I mean string of bits, not null-terminated byte string) for all the bits. If you're using "standard" compression algorithms (e.g. Huffman trees, Lempel-Ziv variants, etc.) then it's a long solved problem, and I suggest you read more about these algorithms as there's often information on how to encode it for on-disk storage (which I assume is your real problem here). — Some programmer dude, Feb 03 '20 at 13:02
Then you must store your compressed data as a string of bits (i.e. not as bytes). The decompressor should read/decompress that back. — Paul Ogilvie, Feb 03 '20 at 13:03
Have a look at existing implementations of compression algorithms. Some of them will probably show you how to do it. — klutt, Feb 03 '20 at 13:08
@Someprogrammerdude It's different from the two you mentioned above. — lag, Feb 03 '20 at 13:09
Well, read about them anyway, they might have some hints that you could use. :) — Some programmer dude, Feb 03 '20 at 13:11
@Someprogrammerdude Is there any good source you might recommend? I mean a book. — lag, Feb 03 '20 at 13:13
Thanks everyone, the question is open for more discussion. I hope for even better answers. — lag, Feb 03 '20 at 13:16
Not really my area of expertise, and I haven't read any books about compression. But some time ago, just by reading [the Huffman coding article on Wikipedia](https://en.wikipedia.org/wiki/Huffman_coding), I managed to get a simple encoder and decoder working, storing the tree on disk and loading it again. — Some programmer dude, Feb 03 '20 at 13:17

score 1 · Answer 1 · answered Feb 03 '20 at 13:22

So you want to store bits in memory and read them back without knowing how long they are. This is not possible. (It's not possible with bytes either.)

Imagine if you could do this. Then we could compress a file by, for example, saying that "0" compresses to "0" and "1" compresses to "00". After this "compression" (which would actually make the file bigger) we have a file with only 0's in it. Then, we compress the file with only 0's in it by writing down how many 0's there are. Amazing! Any 2GB file compresses to only 4 bytes. But we know it's impossible to compress every 2GB file into 4 bytes. So something is wrong with this idea.

You can read several bits from memory but you need to know how many you are reading. You can also do it if you don't know how many bits you are reading, but the combinations don't "overlap". So if "01" is a valid combination, then you can't have "010" because that would overlap "01". But you could have "001". This is called a prefix code and it is used in Huffman coding, a type of compression.

Of course, you could also save the length before each number. So you could save "0" as "0010" where the "001" means how many bits long the number is. With 3-digit lengths, you could only have up to 7-bit numbers. Or 8-bit numbers if you subtract 1 from the length, in which case you can't have zero-bit numbers. (so "0" becomes "0000", "101" becomes "010101", etc)

score 1 · Answer 2 · answered Aug 03 '20 at 18:55

You can control bits using bit shift operators or bit fields.

Make sure you understand endianness, which is machine dependent. Also keep in mind that bit fields have to live inside a struct, and a struct with `unsigned int` bit fields typically occupies at least 4 bytes, however few bits you actually declare.

And bit fields can be very tricky.

Good luck!

score 0 · Answer 3 · answered Feb 04 '20 at 07:00

If you just need to make sure a given binary number is evaluated properly, then you have two choices I can think of. You could store the bit count of each number alongside the number itself, which wouldn't be very efficient.

But you could also store all the binary numbers as 8-bit values and then, when processing each individual number, scan its digits to find its length. That way you only keep the length of a single number at a time.

Here is some quick code, hopefully it's clear:

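    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint8_t rightNumber = 2; /* 10 in binary, or 00000010 */
        int rightLength = 2;     /* since it is 2 bits long */

        uint8_t bn = 2;          /* placeholder: the value you want to test */
        int i;
        /* Scan from the most significant bit down to the highest set bit. */
        for (i = 7; i > 0; i--) {
            if ((bn & (1 << i)) != 0)
                break;
        }
        int length = i + 1;      /* a value of 0 counts as 1 bit long */
        if (bn == rightNumber && length == rightLength)
            printf("Correct number\n");
        else
            printf("Incorrect number\n");
        return 0;
    }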

Keep in mind you can also use the same technique to calculate the number of bits in the right value instead of precomputing it. The same approach also works if you are comparing two arbitrary values. Hope this helps; if not, feel free to criticize or re-explain your problem.

Seems interesting, though I would appreciate it if you could provide further explanation, more in the context of my question. Will the code above evaluate 0010 differently than 10? — lag, Feb 04 '20 at 12:24
@Ninja47 To be precise, it will calculate the length of 0010 as 2, since the number actually needs just two digits (10). So unfortunately, if you want 10 to test as not equal to 0010, you'd have to supply the length of 0010 as 4 bits yourself. It might not work for every occasion. Your two other options would be storing the bits as strings or having other variables to keep the size of your binary numbers, the second being the more compact. — Le mouton vert, Feb 04 '20 at 20:34
