
I am trying to do some calculations in the fragment shader in WebGL2, and I've noticed that the calculations there are not as precise as in C++. I know that a high-precision float contains 32 bits, both in the fragment shader and in C++.

I am trying to compute 1.0000001^(10000000) and get around 2.8 in C++ and around 3.2 in the shader. Do you know the reason the fragment shader calculations are not as precise as the same calculations in C++?

C++ code:

#include <iostream>
int main()
{
  const float NEAR_ONE = 1.0000001;
  float result = NEAR_ONE;

  for (int i = 0; i < 10000000; i++)
  {
    result = result * NEAR_ONE;
  }

  std::cout << result << std::endl; // result is 2.88419
}

Fragment shader code:

#version 300 es
precision highp float;
out vec4 color;
void main()
{
  const float NEAR_ONE = 1.0000001;
  float result = NEAR_ONE;

  for (int i = 0; i < 10000000; i++)
  {
    result = result * NEAR_ONE;
  }    

  if ((result > 3.2) && (result < 3.3))
  {
    // The screen is colored red, which is how we know
    // that the value of result is between 3.2 and 3.3.
    color = vec4(1.0, 0.0, 0.0, 1.0); // Red
  }
  else
  {
     // We never come here. 
     color = vec4(0.0, 0.0, 0.0, 1.0); // Black
  }
}

Update: Here one can find the HTML file with the full code for the WebGL2 example.

David
    Why don't you just use [e](https://en.wikipedia.org/wiki/E_(mathematical_constant)) directly instead of computing it in such a precision-dependent way? – Max Langhof Dec 12 '19 at 09:19
  • Here is an artificial example to demonstrate that the calculations are not precise. – David Dec 12 '19 at 09:21
  • In that case the answer is most likely "rounding mode". If e.g. C++ always rounds to nearest and the shader code always to next highest the results will be quite different. – Max Langhof Dec 12 '19 at 09:21
  • @MaxLanghof is it possible to change the "rounding mode" on the shader? – David Dec 12 '19 at 09:22
  • You added the ieee-754 tag, but are you sure that your GPU hardware is compliant with that standard? – Bob__ Dec 12 '19 at 09:23
  • @Bob__ Yes, I am sure. See chapter 4.5.1 of https://www.khronos.org/registry/OpenGL/specs/es/3.0/GLSL_ES_Specification_3.00.pdf – David Dec 12 '19 at 09:24
  • Actually, rounding mode alone doesn't explain it: https://godbolt.org/z/eXY_FP It does lead to different results, but none of them near 3.2. – Max Langhof Dec 12 '19 at 09:26
  • @David Quoting from the chapter you linked: _"The rounding mode cannot be set and is undefined"_, and _"The rounding mode is not defined but must not affect the result by more than 1 ULP"_. – Max Langhof Dec 12 '19 at 09:28
  • `float` is likely IEEE-754 single precision whose precision is about ~7 digits, so it can't store values such as 1.0000001 closely. The closest value to it is 1.0000001192092... – phuclv Dec 12 '19 at 09:28
  • @MaxLanghof I think if the problem were only a difference in rounding mode, then the calculations would sometimes be more precise in C++ and sometimes in the shader. But all my experiments show that the shader calculations are never precise enough. For instance, I've tried to generate the Mandelbrot fractal and could actually "see" the difference in results. https://computergraphics.stackexchange.com/questions/9403/why-there-are-calculation-differences-in-webgl-and-opengl – David Dec 12 '19 at 09:31
  • @David If you suspect precision being the issue, try to repeatedly add EPSILON to one and see where the sum stops increasing. But if the shader had less than 32 bit floats, then `1.0000001` (which is equivalent to `1 + FLOAT_EPSILON`) would round down to `1.0` (or to much higher than `1.0000001` so you would end up far beyond `3.2`). – Max Langhof Dec 12 '19 at 09:37
  • Following the last comment on the Q&A you linked, [this](https://stackoverflow.com/questions/4414041/what-is-the-precision-of-highp-floats-in-glsl-es-2-0-for-iphone-ipod-touch-ipad) may be related. In other words, do you know the exact precision of `highp float` in your environment? – Bob__ Dec 12 '19 at 09:37
  • @MaxLanghof I did that experiment yesterday. Both in C++ and in the shader it stops increasing at the same value, somewhere between 1.0000001 and 1.00000001. That proves the shader uses a 32-bit float. – David Dec 12 '19 at 09:41
  • @David `1.00000005` (7 zeros) is represented as `1.0`. As phuclv said, `1.0000001192092` (6 zeros) is the next 32 bit float after `1.0`. I contest that your sum stops increasing before `1.0000001` (6 zeros) as that comes before the first number after `1.0`. The closest float to your lower bound `1.00000001` (7 zeros) is `1.0`. – Max Langhof Dec 12 '19 at 09:44
  • @MaxLanghof Just in case this is the full code for webgl2: https://github.com/khdavid/khdavid.github.io/blob/4e6562e4e9a55714211d205eb8933dee6ed661ce/mandelbrot/experiment.html If you wish you can check your theory – David Dec 12 '19 at 09:53
  • Does WebGL or the compiler being used do any autovectorization? If the arithmetic is reorganized into 16 threads (or equivalent) of 625,000 multiplications whose results are then multiplied together, the result (using round-to-nearest-ties-to-even) is about 3.15748, which is, to two significant digits, the 3.2 reported in the question. Also, an optimizer that recognizes that repeated multiplication is exponentiation and replaces it with `powf(NEAR_ONE, n)` would get about 3.29397, which satisfies the `(result > 3.2) && (result < 3.3)` test. – Eric Postpischil Dec 12 '19 at 12:37
  • (In this regard, observe the WebGL result is more accurate than the C++ result. That is, the result of the multiplications would be near 3.29397 if computed exactly. So the C++ calculations are losing accuracy due to rounding, whereas the WebGL result loses less, and the `powf` result loses even less.) – Eric Postpischil Dec 12 '19 at 12:43
  • @EricPostpischil The result should be very close to e (~2.718...) https://www.wolframalpha.com/input/?i=1.0000001%5E10000000 – David Dec 12 '19 at 12:45
  • @David: No, it should not. In `const float NEAR_ONE = 1.0000001`, the source text `1.0000001` is rounded during conversion to 32-bit floating-point to 1.00000011920928955078125. The program then attempts to compute (1.00000011920928955078125)**1e7, not (1+1e-7)**1e7. – Eric Postpischil Dec 12 '19 at 12:46 (see the sketch below these comments)
  • @EricPostpischil I see. You are right. – David Dec 12 '19 at 12:48
  • If you want to avoid rounding errors during the preparation of `NEAR_ONE`, then use 1 plus a negative power of two, such as `1 + 0x1p-24`, and use `1<<24` for the loop bound instead of 10000000. (In an older compiler without hexadecimal floating-point, use `1 + 1./16777216` instead of `1 + 0x1p-24`.) – Eric Postpischil Dec 12 '19 at 12:49
  • @EricPostpischil So now I understand that in my artificial example WebGL is computing more precisely than C++. But I still have a feeling that WebGL computes less precisely on average. I was trying to compute the Mandelbrot fractal in WebGL and saw very bad results compared with OpenGL computations. You can see my actual code for the Mandelbrot calculations here: https://github.com/khdavid/khdavid.github.io/blob/644c32c1df18ae59889433e46cdfb44a11036f73/mandelbrot/glMandelbrot.js – David Dec 12 '19 at 12:54
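
A small C++ sketch of the point made in the comments above (the printed values assume IEEE-754 binary32/binary64 arithmetic, which is what typical desktop compilers use for float and double):

#include <cmath>
#include <cstdio>

int main()
{
  // The decimal literal 1.0000001 is not exactly representable in binary.
  // Converted to a 32-bit float it rounds up to the next value above 1.0,
  // which is 1 + 2^-23 = 1.00000011920928955078125.
  const float NEAR_ONE = 1.0000001f;
  std::printf("stored value: %.25f\n", NEAR_ONE);

  // So both programs are really computing (1 + 2^-23)^10000000,
  // not (1 + 1e-7)^10000000 ~ e. Evaluated in double precision this
  // target is about 3.29397 -- close to the shader result -- while the
  // float loop in the question drifts down to ~2.884 through accumulated
  // rounding.
  std::printf("target: %.5f\n", std::pow(static_cast<double>(NEAR_ONE), 10000000.0));
}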

2 Answers


OpenGL ES 3.0, on which WebGL2 is based, does not require floating point on the GPU to work the same as it does in C++.

From the spec

2.1.1 Floating-Point Computation

The GL must perform a number of floating-point operations during the course of its operation. In some cases, the representation and/or precision of such operations is defined or limited; by the OpenGL ES Shading Language Specification for operations in shaders, and in some cases implicitly limited by the specified format of vertex, texture, or renderbuffer data consumed by the GL. Otherwise, the representation of such floating-point numbers, and the details of how operations on them are performed, is not specified. We require simply that numbers' floating-point parts contain enough bits and that their exponent fields are large enough so that individual results of floating-point operations are accurate to about 1 part in 10^5. The maximum representable magnitude for all floating-point values must be at least 2^32. x · 0 = 0 · x = 0 for any non-infinite and non-NaN x. 1 · x = x · 1 = x. x + 0 = 0 + x = x. 0^0 = 1. (Occasionally further requirements will be specified.) Most single-precision floating-point formats meet these requirements.
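
To put that requirement in perspective: 1 part in 10^5 is a relative error of up to roughly 1e-5 per operation, which is more than 100 times looser than the ~6e-8 (2^-24) worst-case rounding error of an IEEE-754 binary32 multiply with round-to-nearest. An implementation can therefore conform to this spec while producing results quite different from a C++ float loop.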

Just for fun, let's do it and print the results. Using WebGL1 so we can test on more devices:

function main() {
  const gl = document.createElement('canvas').getContext('webgl');
  const ext = gl.getExtension('OES_texture_float');
  if (!ext) { return alert('need OES_texture_float'); }
  // not required - long story
  gl.getExtension('WEBGL_color_buffer_float');

  const fbi = twgl.createFramebufferInfo(gl, [
    { type: gl.FLOAT, minMag: gl.NEAREST, wrap: gl.CLAMP_TO_EDGE, }
  ], 1, 1);
  
  const vs = `
  void main() {
    gl_Position = vec4(0, 0, 0, 1);
    gl_PointSize = 1.0;
  }
  `;
  const fs = `
  precision highp float;
  void main() {
    const float NEAR_ONE = 1.0000001;
    float result = NEAR_ONE;

    for (int i = 0; i < 10000000; i++) {
      result = result * NEAR_ONE;
    } 
    
    gl_FragColor = vec4(result);
  }
  `;
  
  const prg = twgl.createProgram(gl, [vs, fs]);
  gl.useProgram(prg);
  gl.viewport(0, 0, 1, 1);
  gl.drawArrays(gl.POINTS, 0, 1);
  const values = new Float32Array(4);
  gl.readPixels(0, 0, 1, 1, gl.RGBA, gl.FLOAT, values);
  console.log(values[0]);
}

main();

<script src="https://twgljs.org/dist/4.x/twgl.js"></script>

My results:

Intel Iris Pro          : 2.884186029434204
NVidia GT 750 M         : 3.293879985809326
NVidia GeForce GTX 1060 : 3.2939157485961914
Intel UHD Graphics 617  : 3.292219638824464 
gman
  • This merely says there can be something in WebGL2 that behaves differently. It does not tell us what it is. – Eric Postpischil Dec 13 '19 at 04:08
  • It specifically says it's not specified, which is the point. *"the representation of such floating-point numbers, and the details of how operations on them are performed, **is not specified**."* As long as the implementation meets the precision mentioned above, how it does it is up to the GPU. Different GPUs use different methods since they are competing on speed and price. – gman Dec 13 '19 at 05:02
  • Sure, the GPU may be conforming to the specification. That may be your point. But it still leaves us uninformed about what the GPU actually is doing. The fact that a specification does not specify particular behavior does not mean we cannot inquire further and seek understanding of what is happening. It would be useful to know what precision the GPU is using and whether the observed results arise out of that precision or some other cause. The results you added suggest the implementations producing results around 3.29 are using 64-bit floating point. – Eric Postpischil Dec 13 '19 at 08:23
  • @gman Thank you for your answer. In this spec https://www.khronos.org/registry/OpenGL/specs/es/3.0/GLSL_ES_Specification_3.00.pdf, chapter 4.5.1, there is a table saying that the +, - and * operations must be "correctly rounded". Does that mean that for these operations the precision should follow the IEEE-754 standard? – David Dec 13 '19 at 09:02
  • @EricPostpischil [This](https://www.khronos.org/registry/OpenGL-Refpages/es2.0/xhtml/glGetShaderPrecisionFormat.xml) should give the precision, but I didn't find anything about the rounding method (also implementation defined), which I still suspect should have an impact on the final [result](https://godbolt.org/z/yTq-7_). – Bob__ Dec 13 '19 at 09:02
  • If I change from using a constant to [using a uniform](https://jsfiddle.net/greggman/65Lzy3jf/) then the answer changes from 3.2 to 2.8 on Intel UHD Graphics 617 (not near other computers ATM), so it's possible the 3.2 versions are being optimized to just a precomputed constant by some drivers/browsers, though it's interesting how they differ past 3 decimal places. – gman Dec 13 '19 at 09:10
  • @EricPostpischil Also those [results](https://godbolt.org/z/yTq-7_) make me think about a possible difference in precision between a stored value and a "calculated" one (ALU registers may have a greater precision). – Bob__ Dec 13 '19 at 09:11
  • note: I checked `getShaderPrecisionFormat` across GPUs. All four report the same thing. – gman Dec 13 '19 at 09:15

The difference is precision. In fact, if you compile the C++ code using double (64-bit floating point, with a 53-bit mantissa) instead of float (32-bit floating point, with a 24-bit mantissa), you obtain 3.29397 as the result, which is the result you get from the shader.
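
For reference, a minimal C++ sketch that reproduces that number, under the assumption that "using double" means accumulating in 64-bit double while the constant keeps the 32-bit float value 1.00000011920928955078125 (if the constant itself is made a double, the loop converges toward e instead, as discussed in the comments under the question):

#include <iostream>

int main()
{
  // Same constant as in the question, still rounded as a 32-bit float.
  const float NEAR_ONE = 1.0000001f;

  // Accumulate the product in 64-bit double rather than 32-bit float.
  double result = NEAR_ONE;

  for (int i = 0; i < 10000000; i++)
  {
    result = result * NEAR_ONE;
  }

  std::cout << result << std::endl; // prints approximately 3.29397
}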

luisp
  • Can you show, from documentation or trial, that the WebGL2 implementation is using more precision? – Eric Postpischil Dec 13 '19 at 04:07
  • The proof that WebGL2 is using more precision in this case is that the result of the original experiment matches the result obtained when using double precision. In the specification, highp implies using _at least_ 32 bits, but it does not forbid using more than that. What is happening here is that the implementation is using 64 bits, thus more than the 32 bits of the C++ float. – luisp Dec 13 '19 at 12:51