
This refers to a comment on the post - Sorting 10 million integers with 1 MB space Solution explanation - Programming Pearls

'You can use 4 bits per counter for this, not 2 bytes. If you group counters you can even lower this value, for example if you group 3 counters, that's 10*10*10=1000 combinations and you need 10 bits (=1024 values) for that.'

Since I am not from a computer science background, I would appreciate help in understanding the technique behind tracking counters in half a byte.

IUnknown
  • 2
    `If each integer appears at most ten times, then we can count its occurence in a four-bit half byte.` The number 10 (ten) can be encoded using 4 bits, hence a half-byte. But 10 requires only `log2(10) ~= 3.32` bits to be represented, so you can pack several counters together to save even more space (as in the quote). – user703016 Feb 25 '15 at 03:28
  • Perfect explanation - thanks – IUnknown Feb 25 '15 at 03:51

1 Answer


The idea is that you use bit manipulation to exploit the fact that you have more bits at your disposal than are needed to represent the individual values you are storing. This is most practical when you are using a programming language that intrinsically supports bit manipulation, such as C++.

A positive integer num requires floor(log2(num)) + 1 bits to represent its value in binary. Conversely, the largest integer that can be stored in n bits is 2^n - 1. So if we limit ourselves to 4 bits, the biggest number we can represent is 15.
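To make the arithmetic above concrete, here is a minimal sketch of both calculations. The function names `bitsNeeded` and `maxValue` are made up for this illustration and are not part of any library.

```cpp
#include <cstdint>

// Hypothetical helper: number of bits needed to write n in binary.
// By convention here, 0 is treated as needing 1 bit.
unsigned bitsNeeded(std::uint32_t n) {
    unsigned bits = 1;
    while (n >>= 1) {   // drop one binary digit per iteration
        ++bits;
    }
    return bits;
}

// Hypothetical helper: largest value that fits in `bits` bits, i.e. 2^bits - 1.
// Only valid for bits < 32, since the shift must stay inside a 32-bit value.
std::uint32_t maxValue(unsigned bits) {
    return (std::uint32_t{1} << bits) - 1;
}
```

With these, `bitsNeeded(10)` gives 4 and `maxValue(4)` gives 15, which is why a counter that never goes above ten fits in half a byte.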

For simplicity's sake, let's assume that the number of bits available in each value is 8 (1 byte). And as you requested, we will use 4 bits for each stored value. Also, the code that follows will be in C++.

Helper Constants

These constants will help in the computations.

 const int PIECE_MASK_LOWER = 15; // All 4 lower bits set to 1
 // NOTE: 15 = 2^0 + 2^1 + 2^2 + 2^3
 const int VALUE_SIZE = 4; // We're using 4 bits for each value

Retrieving

Use bitwise AND logic to extract only the bits of the value you want. Bitwise AND logic gives a value whose binary representation contains 1s where both of the operands had 1s.

Retrieving the lower piece

val = compositeVal & PIECE_MASK_LOWER;

Retrieving the upper piece

This is the same as getting the lower piece; but you need to right-shift the bits first so that the bits of the target value are in the correct position for the bitwise AND operation.

val = (compositeVal >> VALUE_SIZE) & PIECE_MASK_LOWER;
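As a rough sketch, the two retrieval expressions can be wrapped in functions like this. The names `lowerPiece` and `upperPiece` are assumptions for the example, and the byte is assumed to be a `std::uint8_t`.

```cpp
#include <cstdint>

const int PIECE_MASK_LOWER = 15; // binary 0000 1111
const int VALUE_SIZE = 4;        // bits per stored value

// Hypothetical helper: the value held in the low 4 bits of the byte.
std::uint8_t lowerPiece(std::uint8_t compositeVal) {
    return static_cast<std::uint8_t>(compositeVal & PIECE_MASK_LOWER);
}

// Hypothetical helper: the value held in the high 4 bits of the byte.
// The shift moves the high bits down so the same mask can be reused.
std::uint8_t upperPiece(std::uint8_t compositeVal) {
    return static_cast<std::uint8_t>((compositeVal >> VALUE_SIZE) & PIECE_MASK_LOWER);
}
```

For example, the byte `0xA7` is `1010 0111` in binary, so `lowerPiece(0xA7)` is 7 and `upperPiece(0xA7)` is 10.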

Storing

Use bitwise OR logic to apply the bits of the value you want to store to the byte in which you are storing it. Bitwise OR logic gives a result whose binary representation contains 1s where either of the operands had 1s.

Storing the lower piece

compositeVal = compositeVal | val;

Storing the upper piece

This is the same as storing the lower piece, except you need to left-shift the bits so that they are in position for the bitwise OR operation.

compositeVal = compositeVal | (val << VALUE_SIZE);

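Note: if you are overwriting a piece in the composite value, you first have to clear the bits that are already there, for example by ANDing with the inverted mask (compositeVal & ~PIECE_MASK_LOWER clears the lower piece). Otherwise the bitwise OR operation will merge the old and new bits and give you corrupted values.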
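Putting the pieces together, here is one possible sketch of storing with the clearing step included, plus a small array of 4-bit counters built on top of it. `storeLower`, `storeUpper`, and `NibbleCounters` are names invented for this example, and packing two counters per `std::uint8_t` is an assumption about the layout, not something taken from the linked post.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

const int PIECE_MASK_LOWER = 15; // binary 0000 1111
const int VALUE_SIZE = 4;        // bits per stored value

// Overwrite the lower piece: clear the low 4 bits, then OR in the new value.
std::uint8_t storeLower(std::uint8_t compositeVal, std::uint8_t val) {
    return static_cast<std::uint8_t>(
        (compositeVal & ~PIECE_MASK_LOWER) | (val & PIECE_MASK_LOWER));
}

// Overwrite the upper piece: keep only the low 4 bits, then OR in the shifted value.
std::uint8_t storeUpper(std::uint8_t compositeVal, std::uint8_t val) {
    return static_cast<std::uint8_t>(
        (compositeVal & PIECE_MASK_LOWER) | ((val & PIECE_MASK_LOWER) << VALUE_SIZE));
}

// Hypothetical container: `count` counters, two per byte, each in the range 0..15.
struct NibbleCounters {
    std::vector<std::uint8_t> bytes;

    explicit NibbleCounters(std::size_t count) : bytes((count + 1) / 2, 0) {}

    // Even indices live in the lower piece, odd indices in the upper piece.
    unsigned get(std::size_t i) const {
        const std::uint8_t b = bytes[i / 2];
        return (i % 2 == 0) ? (b & PIECE_MASK_LOWER)
                            : ((b >> VALUE_SIZE) & PIECE_MASK_LOWER);
    }

    void set(std::size_t i, unsigned val) {
        std::uint8_t& b = bytes[i / 2];
        const std::uint8_t v = static_cast<std::uint8_t>(val);
        b = (i % 2 == 0) ? storeLower(b, v) : storeUpper(b, v);
    }

    // The caller must make sure the counter is below 15 before incrementing.
    void increment(std::size_t i) { set(i, get(i) + 1); }
};
```

With this layout, 10 million counters take about 5 million bytes instead of the 20 million that 2-byte counters would need. Starting from the byte `0xA7`, `storeLower(0xA7, 3)` gives `0xA3` and `storeUpper(0xA7, 3)` gives `0x37`; in both cases the other piece is left untouched.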

Sildoreth