Compile time computing of number of bits needed to encode n different states

Question

Edit: The initial question had a wrong formula, and the algorithm I tried did something completely different from what was intended. I apologise; I have rewritten the question to eliminate the confusion.

I need to compute at compile time (the result will be used as a non-type template parameter) the minimum number of bits needed to store n different states:

constexpr unsigned bitsNeeded(unsigned n);

or via template

The results should be:

+-----------+--------+
| number of | bits   |
| states    | needed |
+-----------+--------+
|     0     |    0   | * or not defined
|           |        |
|     1     |    0   |
|           |        |
|     2     |    1   |
|           |        |
|     3     |    2   |
|     4     |    2   |
|           |        |
|     5     |    3   |
|     6     |    3   |
|     7     |    3   |
|     8     |    3   |
|           |        |
|     9     |    4   |
|    10     |    4   |
|    ..     |   ..   |
|    16     |    4   |
|           |        |
|    17     |    5   |
|    18     |    5   |
|    ..     |   ..   |
+-----------+--------+

The initial (somewhat corrected) version, for reference:

I need to compute at compile time (the result will be used as a non-type template parameter) the minimum number of bits needed to store n different states, i.e. the binary logarithm ~~(rounded down)~~ rounded up to the nearest integer:

ceil(log2(n))

constexpr unsigned ceilLog2(unsigned n);

This is what I came up with (completely wrong):

constexpr unsigned intLog2(unsigned num_states_) {
  return
    num_states_ == 1 ?
      1 :
      (
      intLog2(num_states_ - 1) * intLog2(num_states_ - 1) == num_states_ - 1 ?
        intLog2(num_states_ - 1) + 1 :
        intLog2(num_states_ - 1)
      );
}

~~This produces the correct result~~ (for num_states_ != 0), but the recursion blows up exponentially and is practically unusable for any input greater than 10 (memory usage during compilation grows beyond 2 GB, the OS freezes and the compiler crashes).

How can I compute this at compile time in a practical manner?

@Henrik it doesn't. `intLog2Runtime(256) = 16`, `intLog2Runtime(1024) = 32`. — DanielKO, May 21 '14 at 11:40
@DanielKO [GCC 4.8 complains for 1024](http://coliru.stacked-crooked.com/a/0517599d95d14504)... — rubenvb, May 21 '14 at 11:41
Seems you're trying to compute the square root, not the logarithm. — Henrik, May 21 '14 at 12:09
The minimum number of bits would be the binary logarithm rounded up (not down). — Henrik, May 21 '14 at 12:12
@Henrik yes, I messed both the formula and the algorithm. Sorry. — bolov, May 21 '14 at 12:32

Henrik · Answer 1 · 2014-05-21T14:00:59.053

13

The minimum number of bits required to store n different states is ceil(log2(n)).

constexpr unsigned floorlog2(unsigned x)
{
    return x == 1 ? 0 : 1+floorlog2(x >> 1);
}

constexpr unsigned ceillog2(unsigned x)
{
    return x == 1 ? 0 : floorlog2(x - 1) + 1;
}

Note that ceillog2(1) == 0. This is perfectly fine: if you want to serialize an object and you know that one of its data members can only take on the value 42, you don't need to store anything for this member. Just assign 42 on deserialization.
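Since the question wants the result as a non-type template parameter, here is a minimal sketch of that use. The `StateField` type is hypothetical, and the base case is widened to `x <= 1` (my assumption, not part of the answer's code) so that an argument of 0 terminates instead of recursing forever.

```cpp
#include <bitset>

// Same recursion as in the answer, except that the base case is x <= 1
// rather than x == 1, so floorlog2(0) and ceillog2(0) yield 0 (assumption).
constexpr unsigned floorlog2(unsigned x)
{
    return x <= 1 ? 0 : 1 + floorlog2(x >> 1);
}

constexpr unsigned ceillog2(unsigned x)
{
    return x <= 1 ? 0 : floorlog2(x - 1) + 1;
}

// Hypothetical example type: storage just wide enough for NumStates values.
template <unsigned NumStates>
struct StateField
{
    std::bitset<ceillog2(NumStates)> bits;
};

static_assert(ceillog2(1) == 0, "a single state needs no bits");
static_assert(ceillog2(8) == 3, "8 states fit in 3 bits");
static_assert(ceillog2(9) == 4, "9 states need 4 bits");
```

`StateField<1>` ends up with a `std::bitset<0>`, which matches the "nothing to store" remark above.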

edited May 21 '14 at 14:00

answered May 21 '14 at 12:17

Henrik

@DanielKO looks like your compiler is broken. `ceillog2(2) == 1` http://ideone.com/6ztr7X – Henrik May 21 '14 at 12:48
my mistake, I pasted the code incorrectly; I keep forgetting that middle-button pasting is unreliable in web-based text editors. – DanielKO May 21 '14 at 12:51

DanielKO · Answer 2 · 2014-05-21T12:53:45.843

10

Try this:

constexpr unsigned numberOfBits(unsigned x)
{
    return x < 2 ? x : 1+numberOfBits(x >> 1);
}

Simpler expression, correct result.

EDIT: "correct result" as in "the proposed algorithm doesn't even come close"; of course, I'm computing the "number of bits to represent value x"; subtract 1 from the argument if you want to know how many bits to count from 0 to x-1. To represent 1024 you need 11 bits, to count from 0 to 1023 (1024 states) you need 10.

EDIT 2: renamed the function to avoid confusion.
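To make the "subtract 1 from the argument" remark concrete, here is a small sketch. The wrapper name `bitsForStates` is made up for illustration, and mapping 0 states to 0 bits is an assumption.

```cpp
// Function from this answer: number of bits needed to write the value x.
constexpr unsigned numberOfBits(unsigned x)
{
    return x < 2 ? x : 1 + numberOfBits(x >> 1);
}

// Hypothetical wrapper: bits needed to tell n states apart, which is the
// width of the largest index, n - 1. n == 0 is mapped to 0 by assumption.
constexpr unsigned bitsForStates(unsigned n)
{
    return n == 0 ? 0 : numberOfBits(n - 1);
}

static_assert(numberOfBits(1024) == 11, "the value 1024 takes 11 bits");
static_assert(bitsForStates(1024) == 10, "counting 0..1023 takes 10 bits");
static_assert(bitsForStates(8) == 3, "8 states take 3 bits");
```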

edited May 21 '14 at 12:53

answered May 21 '14 at 11:48

DanielKO

Correct result? Are you sure? – Konrad Rudolph May 21 '14 at 12:24
yes, my algorithm was doing something completely different. I am sorry about that. What I am searching is the number of bits needed to store n different values, e.g. for n = 4 there are 2 bits needed. Your solution is very useful, thank you, easy to adapt to what I want. Again sorry for the very bad question. – bolov May 21 '14 at 12:41
again sorry for the confusion in the question. Your function is wrong when `x` is a power of 2. For instance for 8 states only 3 bits are needed. Your answer gives 4. – bolov May 21 '14 at 13:42
@bolov either you didn't read the *edit*, or you didn't try to write 8 in binary (1000, that's 4 bits). – DanielKO May 21 '14 at 16:11
Suggest deferring to @bolov's answer and deleting this one, it's too focused on the discussion... – einpoklum Jul 17 '15 at 13:55

bolov · Accepted Answer · 2014-05-21T13:49:59.790

5

Due to the confusion caused by the initial question I chose to post this answer. This is built upon the answers of @DanielKO and @Henrik.

The minimum number of bits needed to encode n different states:

constexpr unsigned bitsNeeded(unsigned n) {
  return n <= 1 ? 0 : 1 + bitsNeeded((n + 1) / 2);
}
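As a quick sanity check, the rows of the table in the question can be written as compile-time assertions. This is only a sketch that repeats the function so it compiles on its own.

```cpp
constexpr unsigned bitsNeeded(unsigned n) {
  // Each extra bit halves the number of states left to distinguish
  // (rounding up), until at most one state remains.
  return n <= 1 ? 0 : 1 + bitsNeeded((n + 1) / 2);
}

// Rows taken from the table in the question.
static_assert(bitsNeeded(0) == 0, "0 states");
static_assert(bitsNeeded(1) == 0, "1 state");
static_assert(bitsNeeded(2) == 1, "2 states");
static_assert(bitsNeeded(3) == 2 && bitsNeeded(4) == 2, "3..4 states");
static_assert(bitsNeeded(5) == 3 && bitsNeeded(8) == 3, "5..8 states");
static_assert(bitsNeeded(9) == 4 && bitsNeeded(16) == 4, "9..16 states");
static_assert(bitsNeeded(17) == 5 && bitsNeeded(18) == 5, "17..18 states");
```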

edited May 21 '14 at 13:49

answered May 21 '14 at 13:39

bolov

This overflows for `n == UINT_MAX` which @DanielKO’s doesn’t. – mirabilos Jan 02 '22 at 22:03
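One possible way around that overflow, as a sketch: compute the rounded-up half as `n / 2 + n % 2`, which never exceeds `n`. The name `bitsNeededSafe` is made up for illustration.

```cpp
#include <climits>
#include <limits>

// Variant of bitsNeeded that avoids n + 1: n / 2 + n % 2 equals
// ceil(n / 2) and cannot wrap around, even for n == UINT_MAX.
constexpr unsigned bitsNeededSafe(unsigned n) {
  return n <= 1 ? 0 : 1 + bitsNeededSafe(n / 2 + n % 2);
}

static_assert(bitsNeededSafe(8) == 3, "8 states fit in 3 bits");
static_assert(bitsNeededSafe(9) == 4, "9 states need 4 bits");
// UINT_MAX states need every value bit of unsigned.
static_assert(bitsNeededSafe(UINT_MAX) == std::numeric_limits<unsigned>::digits,
              "no wrap-around at the top of the range");
```

With the `(n + 1) / 2` form, `UINT_MAX + 1` wraps to 0, so the recursion stops after one step and the result is 1.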

score 4 · Answer 4 · answered Aug 19 '19 at 15:35

4

In C++20, we have (in the header <bit>):

template<class T>
  constexpr T log2p1(T x) noexcept;

Returns: If x == 0, 0; otherwise one plus the base-2 logarithm of x, with any fractional part discarded. Remarks: This function shall not participate in overload resolution unless T is an unsigned integer type.
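This function was standardized in C++20 under the name `std::bit_width`. A sketch of how it maps onto the question follows; the wrapper names are hypothetical, and the fallback branch is my own addition so the snippet still compiles without C++20 library support. The number of bits for n states is the bit width of n - 1.

```cpp
#if defined(__has_include)
#  if __has_include(<bit>)
#    include <bit>  // defines __cpp_lib_int_pow2 when std::bit_width exists
#  endif
#endif

#if defined(__cpp_lib_int_pow2) && __cpp_lib_int_pow2 >= 202002L
// C++20 path: std::bit_width(x) is 0 for x == 0, else 1 + floor(log2(x)).
constexpr unsigned bitWidth(unsigned x) noexcept
{
    return static_cast<unsigned>(std::bit_width(x));
}
#else
// Fallback with the same contract (assumption: used only before C++20).
constexpr unsigned bitWidth(unsigned x) noexcept
{
    return x == 0 ? 0u : 1u + bitWidth(x >> 1);
}
#endif

// Hypothetical wrapper: bits needed to encode n states; 0 states map to 0.
constexpr unsigned bitsNeededViaBitWidth(unsigned n) noexcept
{
    return n == 0 ? 0u : bitWidth(n - 1);
}

static_assert(bitsNeededViaBitWidth(8) == 3, "8 states fit in 3 bits");
static_assert(bitsNeededViaBitWidth(9) == 4, "9 states need 4 bits");
```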

answered Aug 19 '19 at 15:35

Marshall Clow

I think it's called `std::bit_width()` [officially](https://en.cppreference.com/w/cpp/numeric/bit_width) – robert4 May 10 '21 at 02:03
It is *now*; but it wasn't when I wrote that answer. – Marshall Clow May 10 '21 at 03:30

Valerij · Answer 5 · 2014-05-21T11:40:03.677

2

maybe

constexpr int mylog(int n) {
    return (n<2) ?1:
           (n<4) ?2:
           (n<8) ?3:
           (n<16)?4:
           (n<32)?5:
           (n<64)?6:
           …
           ;
}

As you will use it as a template parameter, you might want to check out what Boost has to offer.

edited May 21 '14 at 11:40

answered May 21 '14 at 11:28

Valerij

yeah, I thought about this, I just hoped it wouldn't have to come to this. – bolov May 21 '14 at 11:35

score 1 · Answer 6 · answered May 21 '14 at 11:53

1

constexpr is a bit underpowered and will be until C++14. I recommend templates:

template<unsigned n> struct IntLog2;
template<> struct IntLog2<1> { enum { value = 1 }; };

template<unsigned n> struct IntLog2 {
private:
  typedef IntLog2<n - 1> p;
public:
  enum { value = p::value * p::value == n - 1 ? p::value + 1 : p::value };
};
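Note that the template above is a 1:1 translation of the algorithm from the original question, so it inherits that algorithm's results. For comparison, here is a sketch of the same template technique applied to the halving recurrence from the accepted answer. The name `BitsNeeded` is made up for illustration, and its instantiation depth grows with log2(n) rather than with n.

```cpp
// Hypothetical metafunction: bits needed to encode N distinct states.
template<unsigned N> struct BitsNeeded {
  // One bit halves the states left to distinguish (rounded up).
  enum { value = 1 + BitsNeeded<N / 2 + N % 2>::value };
};

// Base cases: zero or one state needs no bits at all.
template<> struct BitsNeeded<0> { enum { value = 0 }; };
template<> struct BitsNeeded<1> { enum { value = 0 }; };

static_assert(BitsNeeded<8>::value == 3, "8 states fit in 3 bits");
static_assert(BitsNeeded<9>::value == 4, "9 states need 4 bits");
```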

answered May 21 '14 at 11:53

Jon Purdy

**Why** do you suggest templates? The same is trivially possible with `constexpr`. – Konrad Rudolph May 21 '14 at 12:32
@KonradRudolph: My hunch was that since templates have been used this way for longer than `constexpr`, the compiler might evaluate this 1:1 translation more quickly. I should have thought to change the algorithm, though, and I can understand if your objection is aesthetic. – Jon Purdy May 21 '14 at 15:01

score 1 · Answer 7 · answered May 21 '14 at 23:12

Something I've used in my own code:

#include <cstdint>

static inline constexpr
uint_fast8_t log2ceil (uint32_t value)
/* Computes the ceiling of log_2(value) */
{
    if (value >= 2)
    {
        uint32_t mask = 0x80000000;
        uint_fast8_t result = 32;
        value = value - 1;

        while (mask != 0) {
            if (value & mask)
                return result;
            mask >>= 1;
            --result;
        }
    }
    return 0;
}

It requires C++14 to be usable as constexpr, but it has the nice property that it's reasonably fast at run time, about an order of magnitude faster than using std::log and std::ceil. I've verified that it produces the same results for all representable non-zero values using the following program (log is undefined at zero, though 0 is a reasonable result for this application; you don't need any bits to distinguish zero values):

#include <iostream>
#include <cstdlib>
#include <cstdint>
#include <cmath>
#include "log2ceil.hh"

using namespace std;

int main ()
{
    for (uint32_t i = 1; i; ++i)
    {
        // If auto is used, stupid things happen if std::uint_fast8_t
        // is a typedef for unsigned char
        int l2c_math = ceil (log (i) / log (2));
        int l2c_mine = log2ceil (i);
        if (l2c_mine != l2c_math)
        {
            cerr << "Incorrect result for " << i << ": cmath gives "
                 << l2c_math << "; mine gives " << l2c_mine << endl;
            return EXIT_FAILURE;
        }
    }

    cout << "All results are as correct as those given by ceil/log." << endl;
    return EXIT_SUCCESS;
}

This shouldn't be too hard to generalize to different argument widths, either.
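A sketch of what such a generalization might look like, assuming C++14 and unsigned integer types only. The name `log2ceilGeneric` is hypothetical, and the width comes from `std::numeric_limits<T>::digits` instead of the hard-coded 32.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical generalization of log2ceil to any unsigned integer type.
template <typename T>
constexpr unsigned log2ceilGeneric (T value)
{
    static_assert (std::is_unsigned<T>::value, "T must be an unsigned type");
    if (value >= 2)
    {
        constexpr unsigned width = std::numeric_limits<T>::digits;
        // Start at the most significant bit of T and scan downwards.
        T mask = static_cast<T> (T (1) << (width - 1));
        unsigned result = width;
        value = static_cast<T> (value - 1);

        while (mask != 0) {
            if (value & mask)
                return result;
            mask >>= 1;
            --result;
        }
    }
    return 0;
}

static_assert (log2ceilGeneric<std::uint8_t> (255) == 8, "8-bit argument");
static_assert (log2ceilGeneric<std::uint16_t> (1024) == 10, "16-bit argument");
static_assert (log2ceilGeneric<std::uint64_t> ((std::uint64_t (1) << 40) + 1) == 41,
               "64-bit argument");
```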
