19

Edit: The initial question had a wrong formula, and the algorithm I tried did something completely different from what was intended. I apologise; I have rewritten the question to eliminate the confusion.

I need to compute at compile time (the result will be used as a non-type template parameter) the minimum number of bits needed to store n different states:

constexpr unsigned bitsNeeded(unsigned n);

or via template

The results should be:

+-----------+--------+
| number of | bits   |
| states    | needed |
+-----------+--------+
|     0     |    0   | * or not defined
|           |        |
|     1     |    0   |
|           |        |
|     2     |    1   |
|           |        |
|     3     |    2   |
|     4     |    2   |
|           |        |
|     5     |    3   |
|     6     |    3   |
|     7     |    3   |
|     8     |    3   |
|           |        |
|     9     |    4   |
|    10     |    4   |
|    ..     |   ..   |
|    16     |    4   |
|           |        |
|    17     |    5   |
|    18     |    5   |
|    ..     |   ..   |
+-----------+--------+

The initial (somewhat corrected) version, for reference:

I need to compute at compile time (the result will be used as a non-type template parameter) the minimum number of bits needed to store n different states, i.e. the binary logarithm rounded up:

ceil(log2(n))

constexpr unsigned ceilLog2(unsigned n);

This is what I came up with (completely wrong):

constexpr unsigned intLog2(unsigned num_states_) {
  return
    num_states_ == 1 ?
      1 :
      (
      intLog2(num_states_ - 1) * intLog2(num_states_ - 1) == num_states_ - 1 ?
        intLog2(num_states_ - 1) + 1 :
        intLog2(num_states_ - 1)
      );
}

This produces the correct result (for num_states_ != 0), but the recursion blows up exponentially and it is practically unusable for any input greater than 10: memory usage during compilation grows beyond 2 GB, the OS freezes, and the compiler crashes.
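To make the exponential growth concrete: each call to intLog2(n) evaluates intLog2(n - 1) three times (twice in the comparison, once in whichever branch is selected), so the number of calls satisfies T(n) = 1 + 3 * T(n - 1) with T(1) = 1, which gives T(n) = (3^n - 1) / 2. The sketch below counts calls at run time; intLog2Counted, callCount and callsFor are hypothetical instrumented helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical counter, incremented on every call.
static std::uint64_t callCount = 0;

// Same expression as intLog2 above, without constexpr so that
// calls can be counted at run time.
unsigned intLog2Counted(unsigned n) {
    ++callCount;
    return n == 1 ? 1
         : (intLog2Counted(n - 1) * intLog2Counted(n - 1) == n - 1
                ? intLog2Counted(n - 1) + 1
                : intLog2Counted(n - 1));
}

// Resets the counter, runs the recursion once, and reports the call count.
std::uint64_t callsFor(unsigned n) {
    assert(n >= 1);  // n == 0 would never reach the base case
    callCount = 0;
    intLog2Counted(n);
    return callCount;
}
```

Under this count, callsFor(10) is 29524, and the formula gives roughly 1.7 billion evaluations for n = 20, which is consistent with the compiler running out of memory.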

How can I compute this at compile time in a practical manner?

bolov

7 Answers

13

The minimum number of bits required to store n different states is ceil(log2(n)).

constexpr unsigned floorlog2(unsigned x)
{
    return x == 1 ? 0 : 1+floorlog2(x >> 1);
}

constexpr unsigned ceillog2(unsigned x)
{
    return x == 1 ? 0 : floorlog2(x - 1) + 1;
}

Note that ceillog2(1) == 0. This is perfectly fine, because if you want to serialize an object, and you know that one of its data members can only take on the value 42, you don't need to store anything for this member. Just assign 42 on deserialization.

Henrik
  • @DanielKO looks like your compiler is broken. `ceillog2(2) == 1` http://ideone.com/6ztr7X – Henrik May 21 '14 at 12:48
  • my mistake, I pasted the code incorrectly; I keep forgetting that middle-button pasting is unreliable in web-based text editors. – DanielKO May 21 '14 at 12:51
10

Try this:

constexpr unsigned numberOfBits(unsigned x)
{
    return x < 2 ? x : 1+numberOfBits(x >> 1);
}

Simpler expression, correct result.

EDIT: "correct result" as in "the proposed algorithm doesn't even come close"; of course, I'm computing the "number of bits to represent value x"; subtract 1 from the argument if you want to know how many bits to count from 0 to x-1. To represent 1024 you need 11 bits, to count from 0 to 1023 (1024 states) you need 10.

EDIT 2: renamed the function to avoid confusion.
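A minimal sketch of the "subtract 1 from the argument" adaptation described in the first edit. bitsForStates is a hypothetical wrapper name, and mapping n == 0 to 0 is an assumption (the question leaves that case open).

```cpp
constexpr unsigned numberOfBits(unsigned x)
{
    return x < 2 ? x : 1 + numberOfBits(x >> 1);
}

// Hypothetical wrapper: bits needed to distinguish n states,
// i.e. to count from 0 to n - 1.
// Assumption: n == 0 yields 0 instead of wrapping around to UINT_MAX.
constexpr unsigned bitsForStates(unsigned n)
{
    return n == 0 ? 0 : numberOfBits(n - 1);
}

static_assert(numberOfBits(1024) == 11, "the value 1024 takes 11 bits");
static_assert(bitsForStates(1024) == 10, "0..1023 fits in 10 bits");
static_assert(bitsForStates(8) == 3, "8 states need 3 bits");
```

The two functions answer different questions, which is the source of the disagreement in the comments: one measures the width of a value, the other the width of a range of states.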

DanielKO
  • Correct result? Are you sure? – Konrad Rudolph May 21 '14 at 12:24
  • 1
    yes, my algorithm was doing something completely different. I am sorry about that. What I am searching is the number of bits needed to store n different values, e.g. for n = 4 there are 2 bits needed. Your solution is very useful, thank you, easy to adapt to what I want. Again sorry for the very bad question. – bolov May 21 '14 at 12:41
  • again sorry for the confusion in the question. Your function is wrong when `x` is a power of 2. For instance for 8 states only 3 bits are needed. Your answer gives 4. – bolov May 21 '14 at 13:42
  • @bolov either you didn't read the *edit*, or you didn't try to write 8 in binary (1000, that's 4 bits). – DanielKO May 21 '14 at 16:11
  • Suggest deferring to @bolov's answer and deleting this one, it's too focused on the discussion... – einpoklum Jul 17 '15 at 13:55
5

Due to the confusion caused by the initial question I chose to post this answer. This is built upon the answers of @DanielKO and @Henrik.

The minimum number of bits needed to encode n different states:

constexpr unsigned bitsNeeded(unsigned n) {
  return n <= 1 ? 0 : 1 + bitsNeeded((n + 1) / 2);
}
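As a sanity check, the function can be tested against the table in the question with static_assert and then used as a non-type template argument. This is only a sketch: PackedField and WeekdayField are hypothetical names invented for the illustration.

```cpp
constexpr unsigned bitsNeeded(unsigned n) {
  return n <= 1 ? 0 : 1 + bitsNeeded((n + 1) / 2);
}

// Spot checks against the table in the question.
static_assert(bitsNeeded(1) == 0, "1 state needs no bits");
static_assert(bitsNeeded(2) == 1, "");
static_assert(bitsNeeded(4) == 2, "");
static_assert(bitsNeeded(5) == 3, "");
static_assert(bitsNeeded(8) == 3, "");
static_assert(bitsNeeded(9) == 4, "");
static_assert(bitsNeeded(17) == 5, "");

// Hypothetical consumer taking the bit count as a non-type template parameter.
template <unsigned Bits>
struct PackedField {
  static constexpr unsigned bits = Bits;
};

// Seven states (e.g. days of the week) fit in 3 bits.
using WeekdayField = PackedField<bitsNeeded(7)>;
static_assert(WeekdayField::bits == 3, "");
```

One caveat worth keeping in mind: n + 1 wraps around to 0 when n is the largest unsigned value, so that single input returns 1 rather than the full width of unsigned.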
bolov
4

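In C++20, we have (in the header <bit>) the following function. It appeared as log2p1 in the working draft and was renamed std::bit_width in the published standard: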

template<class T>
  constexpr T log2p1(T x) noexcept;

Returns: If x == 0, 0; otherwise one plus the base-2 logarithm of x, with any fractional part discarded. Remarks: This function shall not participate in overload resolution unless T is an unsigned integer type.

Marshall Clow
2

maybe

constexpr int mylog(int n) {
    return (n<2) ?1:
           (n<4) ?2:
           (n<8) ?3:
           (n<16)?4:
           (n<32)?5:
           (n<64)?6:
           …
           ;
}

Since you will use it as a template parameter, you might want to check out what Boost has to offer.

Valerij
1

constexpr is a bit underpowered and will be until C++14. I recommend templates:

template<unsigned n> struct IntLog2;
template<> struct IntLog2<1> { enum { value = 1 }; };

template<unsigned n> struct IntLog2 {
private:
  typedef IntLog2<n - 1> p;
public:
  enum { value = p::value * p::value == n - 1 ? p::value + 1 : p::value };
};
Jon Purdy
  • **Why** do you suggest templates? The same is trivially possible with `constexpr`. – Konrad Rudolph May 21 '14 at 12:32
  • @KonradRudolph: My hunch was that since templates have been used this way for longer than `constexpr`, the compiler might evaluate this 1:1 translation more quickly. I should have thought to change the algorithm, though, and I can understand if your objection is aesthetic. – Jon Purdy May 21 '14 at 15:01
1

Something I've used in my own code:

#include <cstdint>

static inline constexpr
uint_fast8_t log2ceil (uint32_t value)
/* Computes the ceiling of log_2(value) */
{
    if (value >= 2)
    {
        uint32_t mask = 0x80000000;
        uint_fast8_t result = 32;
        value = value - 1;

        while (mask != 0) {
            if (value & mask)
                return result;
            mask >>= 1;
            --result;
        }
    }
    return 0;
}

It requires C++14 to be used as constexpr, but it is reasonably fast at run time: about an order of magnitude faster than using std::log and std::ceil. log is undefined at zero, though 0 is a reasonable result for this application (you don't need any bits to distinguish zero values). For all representable non-zero values, I've verified that it produces the same results as std::log and std::ceil using the following program:

#include <iostream>
#include <cstdlib>
#include <cstdint>
#include <cmath>
#include "log2ceil.hh"

using namespace std;

int main ()
{
    for (uint32_t i = 1; i; ++i)
    {
        // If auto is used, stupid things happen if std::uint_fast8_t
        // is a typedef for unsigned char
        int l2c_math = ceil (log (i) / log (2));
        int l2c_mine = log2ceil (i);
        if (l2c_mine != l2c_math)
        {
            cerr << "Incorrect result for " << i << ": cmath gives "
                 << l2c_math << "; mine gives " << l2c_mine << endl;
            return EXIT_FAILURE;
        }
    }

    cout << "All results are as correct as those given by ceil/log." << endl;
    return EXIT_SUCCESS;
}

This shouldn't be too hard to generalize to different argument widths, either.
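One possible generalization, as a rough sketch rather than a tested drop-in: take the width from std::numeric_limits<T>::digits instead of hard-coding 32. The name log2ceilGeneric is hypothetical, the return type is widened to unsigned for simplicity, and C++14 is assumed because of the loop inside a constexpr function.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical generalization of log2ceil to any unsigned integer type.
template <typename T>
constexpr unsigned log2ceilGeneric(T value)
{
    static_assert(std::is_unsigned<T>::value,
                  "T must be an unsigned integer type");
    if (value >= 2)
    {
        // Number of value bits in T (8, 16, 32, 64, ...).
        constexpr unsigned width = std::numeric_limits<T>::digits;
        // Start with only the most significant bit set.
        T mask = static_cast<T>(T(1) << (width - 1));
        unsigned result = width;
        value = static_cast<T>(value - 1);

        // Scan downwards for the highest set bit of value - 1.
        while (mask != 0) {
            if (value & mask)
                return result;
            mask >>= 1;
            --result;
        }
    }
    return 0;
}

static_assert(log2ceilGeneric<std::uint8_t>(255) == 8, "");
static_assert(log2ceilGeneric<std::uint16_t>(9) == 4, "");
static_assert(log2ceilGeneric<std::uint64_t>(std::uint64_t(1) << 40) == 40, "");
```

The explicit casts are there because narrow types such as std::uint8_t are promoted to int in arithmetic expressions.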

Stuart Olsen