Power of 2 approximation in fixed point

Question

Currently, I am using a small lookup table and linear interpolation which is quite fast and also accurate enough (max error is less than 0.001). However I was wondering if there is an approximation which is even faster.

Since the integer part of the exponent can be extracted and calculated by bitshifts, the approximation just needs to work in the range [-1,1] I have tried to find a chebyshev polynomial, but could not achieve a good accuracy for polynomials of low order. I could live with a max error around 0.01 I guess, but I did not get near that number. Higher order polynomials are not an option, since they are much less efficient than my current lookup table based solution.

maybe more suited for math.stackexchange.com, wolfram alpha gives some serie representations: http://www.wolframalpha.com/input/?i=2%5Ex — , Apr 11 '16 at 13:42
Thanks RC. Unfortunately low order Taylor series are far from accurate and higher order series are much slower than a lookup table based solution. — maniacmic, Apr 11 '16 at 14:10
Decompose it to 2^0.5, 2^0.25, .....and conditionally multiply them together? — user3528438, Apr 11 '16 at 14:46
I'm voting to close this question as off-topic because it's a question about mathematics rather than the C programming language. It would be better suited for the Math Stack Exchange. — Maximillian Laumeister, Apr 12 '16 at 00:30
@user3528438: Good idea, but I believe a lookup is still faster. There are too many steps involved to check every bit and multiply. Also I would still need to keep a table for the values 2^0.5, 2^0.25, ... — maniacmic, Apr 12 '16 at 07:02
@MaximillianLaumeister: I agree with you, that the question is not directly related to C. I am going to remove that tag. I posted this question on Stackoverflow, because I was hoping, that programmers are more likely to know optimizations for fixed point math than mathematicians. I also did not find any fixed point related question on Math Stack Exchange, but if I am not getting any answers here, I will close the question and try my luck there. — maniacmic, Apr 12 '16 at 08:57
@emzed It might also receive more attention on the Computer Science stack exchange. — Maximillian Laumeister, Apr 12 '16 at 17:09

njuffa · Accepted Answer · 2016-04-18T15:28:24.760

Since no specific fixed-point format was stated, I will demonstrate a possible alternative to table lookup using s15.16 fixed-point arithmetic, which is fairly commonly used. The basic idea is to split the input a into an integral portion i and a fractional portion f, such that f in [-0.5,0.5], then use a minimax polynomial approximation for exp2(f) on [-0.5, 0.5] and perform final scaling based on i.

Minimax approximations can be generated with tools such as Mathematica, Maple, or Sollya. If none of these tools are available, one could use a custom implementation of the Remez algorithm to generate minimax aproximations.

The Horner scheme should be used to evaluate the polynomial. Since fixed-point arithmetic is used, the evaluation of the polynomial should scale operands to the maximum extent possible (i.e. without overflow) in intermediate steps to optimized the accuracy of the computation.

The C code below assumes that right shifts applied to signed integer data types result in arithmetic shift operations, and therefore negative operands are shifted appropriately. This is not guaranteed by the ISO C standard, but in my experience it will work fine with various tool chains. In the worst case, inline assembly could be used to force generation of the desired arithmetic right shift instructions.

The output of the test included with the fixed_exp2() implementation below should look as follows:

testing fixed_exp2 with inputs in [-5.96484, 15)
max. rel. err = 0.000999758

This demonstrates that the desired error bound of 0.001 is met for inputs in the interval [-5.96484, 15).

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <math.h>

/* compute exp2(a) in s15.16 fixed-point arithmetic, -16 < a < 15 */
int32_t fixed_exp2 (int32_t a)
{
    int32_t i, f, r, s;
    /* split a = i + f, such that f in [-0.5, 0.5] */
    i = (a + 0x8000) & ~0xffff; // 0.5
    f = a - i;   
    s = ((15 << 16) - i) >> 16;
    /* minimax approximation for exp2(f) on [-0.5, 0.5] */
    r = 0x00000e20;                 // 5.5171669058037949e-2
    r = (r * f + 0x3e1cc333) >> 17; // 2.4261112219321804e-1
    r = (r * f + 0x58bd46a6) >> 16; // 6.9326098546062365e-1
    r = r * f + 0x7ffde4a3;         // 9.9992807353939517e-1
    return (uint32_t)r >> s;
}

double fixed_to_float (int32_t a)
{
    return a / 65536.0;
}

int main (void)
{
    double a, res, ref, err, maxerr = 0.0;
    int32_t x, start, end;

    start = 0xfffa0900;
    end = 0x000f0000;
    printf ("testing fixed_exp2 with inputs in [%g, %g)\n",  
            fixed_to_float (start), fixed_to_float (end));

    for (x = start; x < end; x++) {
        a = fixed_to_float (x);
        ref = exp2 (a);
        res = fixed_to_float (fixed_exp2 (x));
        err = fabs (res - ref) / ref;
        if (err > maxerr) {
            maxerr = err;
        }
    }
    printf ("max. rel. err = %g\n", maxerr);
    return EXIT_SUCCESS;
}

Thanks njuffa! That looks like a fast fixed point exp2 approximation. I have found a formula which uses a simple quadratic approximation, which requires a similar amount of instructions, but is not as accurate as your proposed solution. I am going to try yours and compare for speed as soon as I find the time to do it. I suppose, my initial tests with polynomial approximation did not succeed, because my range [-1,1] was to large. — maniacmic, Apr 18 '16 at 09:00
@thomachan A shift count. Final scaling is based on `i`. Since this is fixed-point arithmetic, scaling by 2**`i` is equivalent to a shift by `s` which is derived directly from `i`. — njuffa, Nov 23 '18 at 10:18
@njuffa - How can I modify the above code for variable Q format? I have s5.26 fixed point format input and range of values is [-31.9, 31.9]. — thomachan, Dec 11 '18 at 05:31
@thomachan That requires more than trivial changes. Most importantly a new core approximation would be required to achieve the required (higher) accuracy. You could use the [Sollya tool](http://sollya.gforge.inria.fr/) to generate the approximation. — njuffa, Dec 11 '18 at 06:05
@thomachan I have working code for an s5.26 implementation of `exp2()` now. Largest absolute error is 1.55354e-7 for the following input plus result `x=11fff9ca fixed_exp2=5a821826 ref=5a821830`. If you can ask a new question that is not a duplicate of this one, I can post an answer with that code. — njuffa, Dec 11 '18 at 10:51
@njuffa https://stackoverflow.com/questions/53736820/fixed-point-approximation-of-2x-with-input-range-of-s5-26 — thomachan, Dec 12 '18 at 08:33

Power of 2 approximation in fixed point

1 Answers1

Linked