How to enable long double in Intel ICC

Question

I'm on MacOS, and am following the guide Using the Intel® Math Library to implement the first example:

// real_math.c
#include <stdio.h> 
#include <mathimf.h>

int main() {
 float fp32bits;
 double fp64bits;
 long double fp80bits;
 long double pi_by_four = 3.141592653589793238/4.0;

// pi/4 radians is about 45 degrees
 fp32bits = (float) pi_by_four; // float approximation to pi/4
 fp64bits = (double) pi_by_four; // double approximation to pi/4
 fp80bits = pi_by_four; // long double (extended) approximation to pi/4

// The sin(pi/4) is known to be 1/sqrt(2) or approximately .7071067 
 printf("When x = %8.8f, sinf(x) = %8.8f \n", fp32bits, sinf(fp32bits));
 printf("When x = %16.16f, sin(x) = %16.16f \n", fp64bits, sin(fp64bits));
 printf("When x = %20.20Lf, sinl(x) = %20.20Lf \n", fp80bits, sinl(fp80bits));

 return 0; 
}

As directed in the guide, I compile with:

icc real_math.c

When I execute, I get:

When x = 0.78539819, sinf(x) = 0.70710677 
When x = 0.7853981633974483, sin(x) = 0.7071067811865475 
When x = 0.78539816339744830952, sinl(x) = 0.00000000000000000000

I've searched wide and far, and all examples seem to show that this should be trivial. What am I missing?

I tried to pass -long_double to icc, but nothing changes.

Are you looking for an option to make `sizeof(long double) > 8`, using x87 80-bit? That should already be the case on GNU/Linux targets, but not by default in the Windows ABI where long-double = double. — Peter Cordes, Oct 13 '20 at 11:14
@PeterCordes • the OP is on MacOS, which is neither a GNU/Linux target nor a Windows ABI target. — Eljay, Oct 13 '20 at 11:17
@DavidRanieri: Presumably `sinl` is getting an all-zero input, or a small integer that represent an 80-bit float close to zero, because the caller disagrees with the callee about where the arg should be passed. Note the all-zero output, which is not mathematically correct, and not what you get for a working compiler like x86-64 GCC targeting Linux with a normal `math.h` - https://godbolt.org/z/z4hjTc — Peter Cordes, Oct 13 '20 at 11:18
Thank you for the godbolt link @PeterCordes, I wasn't aware of this tool. I've set it to replicate my environment, and can see correct output. But when I try this on my local machine, I get 0 for sinl — Seva Safris, Oct 13 '20 at 11:25
@Eljay: Ah right. Same ABI as Linux, x86-64 System V. So `long double` is the 80-bit x87 type. Perhaps Intel's library is using asm functions written for a different ABI, where `long double` = `double` and is passed / returned in XMM0 instead of arg on the stack, ret in `st(0)`. But if the caller tried to `fstp` the return value from `st0` (x87 register), when the x87 register stack was empty, that would store a `-nan` to memory, not `0`. (My earlier comment about a zero *input* to sinl was ignoring the fact that a calling-convention mismatch would be a problem for return values) — Peter Cordes, Oct 13 '20 at 11:36
@SevaSafris: Can you double-check `sizeof(long double)` in your local `icc`? Note that Godbolt's compilers are set up to target GNU/Linux, not MacOS. That should work the same, but possibly not. Also, you could single-step into the `sinl` library function and see if the asm is using x87 instructions or SSE2 / AVX. — Peter Cordes, Oct 13 '20 at 11:39
Just in passing, note that `3.141592653589793238` and `4.0` are both doubles, so the value stored in `pi_by_four` has the precision of a double. To make them long doubles, add a `L`: `3.141592653589793238L` and `4.0L`. (You can also use lower-case l, but nobody does that, because it can look too much like a 1). — Pete Becker, Oct 13 '20 at 12:56

score 0 · Answer 1 · answered Oct 13 '20 at 11:54

Ok, I figured it out. This problem was caused by an error on my part due to the omission of 1 character: L. Specifically, somehow in my own code on my computer I had this:

 printf("When x = %20.20Lf, sinl(x) = %20.20f \n", fp80bits, sinl(fp80bits));

and the correct code is this (note %20.20f vs %20.20Lf):

 printf("When x = %20.20Lf, sinl(x) = %20.20Lf \n", fp80bits, sinl(fp80bits));

I'm not even sure how this could have happened, but this was the root cause.

Thank you everyone for your quick answers, and thanks @PeterCordes for linking me to godbolt.

In future, checking with a debugger (maybe using a tmp var) would have let you see you were getting a result from `sinl`, separate from printf printing it. Also `%f` (unlike `%g`) doesn't distinguish between 0.0 and very tiny floats, so doesn't rule out some kinds of calling-convention mismatches. — Peter Cordes, Oct 13 '20 at 12:35

How to enable long double in Intel ICC

1 Answers1