3

The "magic" method of computing the inverse-square root, dating back to the Quake game apparently, is described in many sources. Wikipedia has a nice article on it: https://en.wikipedia.org/wiki/Fast_inverse_square_root

I particularly found the following to be a very nice write-up and analysis of the algorithm: https://cs.uwaterloo.ca/~m32rober/rsqrt.pdf

I'm trying to replicate some of these results in this paper, but having an issue with the accuracy. The algorithm, coded in C, is given as follows:

#include <math.h>
#include <stdio.h>

float Q_rsqrt(float number) {
  long i;
  float x2, y;
  const float threehalfs = 1.5F;

  x2 = number * 0.5F;
  y = number;
  i = *(long *) &y;
  i = 0x5f3759df - (i >> 1);
  y = *(float *) &i;
  y = y * (threehalfs - (x2 * y * y));
  // y = y * (threehalfs - (x2 * y * y));
  return y;
}

The paper states that the relative error is at most 0.0017522874 for all positive normal floats. (See Appendix 2 for the code, and the discussion in Section 1.4.)

However, when I "plug-in" the number 1.4569335e-2F, the error I get is larger than this predicted tolerance:

int main ()
{

  float f = 1.4569335e-2F;

  double tolerance = 0.0017522874;
  double actual    = 1.0 / sqrt(f);
  float  magic     = Q_rsqrt(f);
  double err       = fabs (sqrt(f) * (double) magic - 1);

  printf("Input    : %a\n", f);
  printf("Actual   : %a\n", actual);
  printf("Magic    : %a\n", magic);
  printf("Err      : %a\n", err);
  printf("Tolerance: %a\n", tolerance);
  printf("Passes   : %d\n", err <= tolerance);

  return 0;
}

The output is:

Input    : 0x1.dd687p-7
Actual   : 0x1.091cc953ea828p+3
Magic    : 0x1.08a5dcp+3
Err      : 0x1.cb5b716b7b6p-10
Tolerance: 0x1.cb5a044e0581p-10
Passes   : 0

So, this particular input seems to violate the claim made in that paper.

I'm wondering if this is an issue with the paper itself, or if I made a mistake in my coding. I'd appreciate any feedback!

alias
  • 28,120
  • 2
  • 23
  • 40
  • 1
    Looking at the [paper](https://cs.uwaterloo.ca/~m32rober/rsqrt.pdf), appendix A.2 , the maximum error has been computed by sampling. Hence, I would not be too worried to find an error slightly larger than the maximum error... – francis Jan 04 '17 at 07:52
  • @francis Not really. The program tests every 32-bit value from 0x00800000 to 0x7f7fffff, which covers almost the entire range of positive float values. – r3mainer Jan 04 '17 at 09:50
  • I think @francis is right; the sampling is the issue here. The stated relative error applies to the original magic number used in Quake, which is what I was using as well. (i.e., the question is really about the original magic number and the associated relative error; not the improvements over it.) – alias Jan 04 '17 at 18:38

2 Answers2

4

You're using the wrong magic number.

0x5f3759df is the value originally used in Quake III, but it was later found that 0x5f375a86 gives better results. If you take a look at Fig. 6.1 on page 40 of the paper you cited, you'll see that it's using the improved constant.

Here are the results I obtained using 0x5f375a86:

Input    : 0x1.dd687p-7
Actual   : 0x1.091cc953ea828p+3
Magic    : 0x1.08a5fap+3
Err      : 0x1.cae79153f2cp-10
Tolerance: 0x1.cb5a044e0581p-10
Passes   : 1
r3mainer
  • 23,981
  • 3
  • 51
  • 88
  • indeed, you're right. This original value of Quake III can be found in the [source of Quake III, in file q_math.c](https://github.com/id-Software/Quake-III-Arena/blob/master/code/game/q_math.c) on line 561. The comment on that very line is not intended to be included on a Q/A site... The history of the constant and the optimal value for 64 bit IEEE754 double (0x5fe6eb50c7b537a9) can be found on [wikipedia], citing [the work of Matthew Robertson](https://cs.uwaterloo.ca/~m32rober/rsqrt.pdf), who also reports the values for double and quad precision! – francis Jan 04 '17 at 18:09
  • I'm not sure if I agree. If you look at Section 4.7 of the paper, the improved magic number of 0x5f375a86 has a maximum relative error of 0.0017512378; which is *different* than the one for the original magic number. So I think my question stands regarding the relative error of the original `Q_rsqrt` function. – alias Jan 04 '17 at 18:36
  • @LeventErkok I see what you mean. By the look of things, the author uses `Q_rsqrt()` to refer to the original Quake III code, and `rsqrt()` to refer to the improved version. Neither of these functions yields the result cited in section 4.7. Perhaps you could try contacting the author for clarification. – r3mainer Jan 04 '17 at 22:02
  • 1
    To test the `Q_rsqrt()` function on my 64 bit computer, I felt it would be better to use `unsigned int` instead of `long`. I found that the largest error of `Q_rsqrt()``is 0x1.cb5d752717ep-10=0.001752338672. It is obtained for the float 0x1.dd678p-49. As noticed by @LeventErkok, it's higher than the cited bound... – francis Jan 04 '17 at 23:18
  • I tried locating the author, but couldn't find an e-mail address for him. Perhaps he'll read this one day! @francis Thanks for the further analysis. If you can turn that comment into an answer, I'd be happy to accept it! – alias Jan 05 '17 at 05:52
  • @LeventErkok I'm not sure but I think it's this guy: https://www.linkedin.com/in/mattcharobertson – Giulio Pulina Oct 28 '18 at 23:22
  • @GiulioPulina Thanks! I lost interest in delving further, but if anyone else wants to the get to the bottom of this your link might prove fruitful! – alias Oct 29 '18 at 15:41
1

Let's try a little piece of code to recompute the bound on the relative error and shows that it's slightly larger than the one in the thesis of Matthew Robertson. Indeed, as noticed first in the answer of @squeamishossifrage and noted in the thesis of Matthew Robertson, this implementation is the one that was disclosed in the source of Quake III. In particular, the original value of the constant of Quake III can be found in the source of Quake III, in file q_math.c on line 561.

First, the code needs to be adapted to work on 64bit plateforms. The only thing that may have to be modified is the integer type: long is not plateform-independent. On my linux computer, sizeof(long) returns 8... As updated in the paper on page 49, the type uint32_t will ensure that the type of the integer is of the same size as a float.

Here goes the code, to be compiled by gcc main.c -o main -lm -Wall and ran by ./main:

#include <math.h>
#include <stdio.h>
#include <inttypes.h>

float Q_rsqrt(float number) {
    uint32_t i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y = number;
    i = *(uint32_t *) &y;
    i = 0x5f3759df - (i >> 1); //  0x5f3759df 0x5f375a86
    y = *(float *) &i;
    y = y * (threehalfs - (x2 * y * y));
    // y = y * (threehalfs - (x2 * y * y));
    return y;
}

int main ()
{

    printf("%ld %ld\n",sizeof(long),sizeof(uint32_t));

    uint32_t i;
    float y;
    double e, max = 0.0;
    float maxval=0;
    for(i = 0x0000000; i < 0x6f800000; i++) {
        y = *(float *) &i;
        if(y>1e-30){
            e = fabs(sqrt((double)y)*(double)Q_rsqrt(y) - 1);
            if(e > max){
                max = e;
                maxval=y;
            }
        }
    }
    printf("On value %2.8g == %a\n", maxval, maxval);
    printf("The bound is %2.12g == %a\n", max, max);

    return 0;
}

For the bound, I obtained 0.0017523386721 == 0x1.cb5d752717ep-10. As you noticed, it is slightly larger than the one reported in the paper (0.001752287). Evaluating the error using float instead of double does not change much the outcome.

francis
  • 9,525
  • 2
  • 25
  • 41