
I am getting a weird error for the following code:

#include <assert.h>
#include <stdio.h>
#include <immintrin.h>

inline static double myfma(double x,double y, double z) {
    double r; // result
    __m128d xx, yy, zz,rr;

    xx = _mm_set_sd(x);// xx[0]=x, xx[1]=0 (_mm_set_sd zeroes the upper element)
    yy = _mm_set_sd(y);// yy[0]=y, yy[1]=0
    zz = _mm_set_sd(z);// zz[0]=z, zz[1]=0
    r = _mm_cvtsd_f64(_mm_fmadd_pd(xx,yy,zz));

    return r;
}

void testfma() {
    double x, y, z, res;
    x = 1.0;
    y = 2.0;
    z = 3.0;

    res =  myfma(x,y,z);
    printf("test: res = x*y + z \n");
    printf("    x: %g\n", x);
    printf("    y: %g\n", y);
    printf("    z: %g\n", z);
    printf("    res: %g\n", res);
    assert(res == 5.0);
}


int main() {
    testfma();
    return 0; 
}

I compile the code with:

g++ test.cpp -o a.out -std=c++11 -mavx2 -mfma  -march=native -g

When I run the executable, I get the message:

Illegal instruction (core dumped)

Running it under gdb gives further details:

gdb ./a.out
(gdb) r
Starting program: ....

Program received signal SIGILL, Illegal instruction.
0x000000000040067d in _mm_fmadd_pd(double __vector(2), double __vector(2), double __vector(2)) (__C=..., __B=..., __A=...)
    at /usr/lib/gcc/x86_64-linux-gnu/5/include/fmaintrin.h:42
42                                                 (__v2df)__C);

However, when I run it under valgrind:

valgrind ./a.out
==9825== Memcheck, a memory error detector
==9825== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==9825== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==9825== Command: ./helios.x
==9825== 
test: res = x*y + z 
    x: 1
    y: 2
    z: 3
    res: 5
==9825== 
==9825== HEAP SUMMARY:
==9825==     in use at exit: 0 bytes in 0 blocks
==9825==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==9825== 
==9825== All heap blocks were freed -- no leaks are possible
==9825== 
==9825== For counts of detected and suppressed errors, rerun with: -v
==9825== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Under valgrind the program seems to work. What am I missing here? How can I use _mm_fmadd_pd in a robust way? Is it possible to make the example work regardless of whether it runs on an Intel or AMD processor? Is it possible to make it compile with either g++ or icpc?

Paul R
  • Are you sure your CPU has FMA? Also, try compiling with `-g` so that when it crashes in gdb you can see what's going on... – Paul R Mar 17 '17 at 09:55
  • Using the -g flag I get the error: (gdb) r Starting program: ...... Program received signal SIGILL, Illegal instruction. 0x000000000040067d in _mm_fmadd_pd(double __vector(2), double __vector(2), double __vector(2)) (__C=..., __B=..., __A=...) at /usr/lib/gcc/x86_64-linux-gnu/5/include/fmaintrin.h:42 42 (__v2df)__C); – Rubén Darío Guerrero Mar 17 '17 at 10:02

1 Answer


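My guess is that your CPU does not support FMA instructions. SIGILL at _mm_fmadd_pd is what you would see when the binary contains an FMA instruction that the processor cannot execute, and -mfma explicitly asks g++ to emit such instructions. It does not fail under valgrind because valgrind does not run the code directly on the hardware: it runs the program on a synthetic CPU and can emulate certain instructions.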
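
One way to confirm this, and to avoid the crash, is to check for FMA at run time before executing the intrinsic. The following is a minimal sketch, assuming GCC or Clang on x86-64: `__builtin_cpu_supports` and the `target` attribute are compiler extensions, so other compilers may need a different detection mechanism. The names `fma_hw`, `fma_portable` and `checked_fma` are hypothetical and do not come from the question.

```cpp
#include <cmath>
#include <immintrin.h>

// Compiled with FMA enabled for this one function only, so the rest of
// the program does not need -mfma. Must only be called on CPUs with FMA.
__attribute__((target("fma")))
static double fma_hw(double x, double y, double z) {
    // Scalar form: only the low element is multiplied and added.
    return _mm_cvtsd_f64(
        _mm_fmadd_sd(_mm_set_sd(x), _mm_set_sd(y), _mm_set_sd(z)));
}

// Fallback that works on any CPU.
static double fma_portable(double x, double y, double z) {
    return std::fma(x, y, z);
}

// Hypothetical wrapper: picks the hardware path only if the CPU that is
// actually running the program reports FMA support.
double checked_fma(double x, double y, double z) {
    __builtin_cpu_init();  // harmless if the CPU data is already initialized
    if (__builtin_cpu_supports("fma")) {
        return fma_hw(x, y, z);
    }
    return fma_portable(x, y, z);
}
```

Calling `checked_fma(1.0, 2.0, 3.0)` should give 5.0 on either path, and the same binary should then run on machines with and without FMA.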

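You might want to consider using std::fma (from <cmath>) if you only need a scalar operation rather than SIMD. It is standard C++11, so it does not depend on the compiler or the CPU vendor. With gcc it generates an inline FMA instruction when the target supports FMA, but if you compile for a non-FMA target then it will fall back to a non-FMA implementation.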
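
As a rough illustration of the std::fma route, the sketch below replaces the intrinsic-based helper with a plain standard-library call. The function name `scalar_fma` is hypothetical; only `std::fma` itself comes from the standard library.

```cpp
#include <cmath>

// Scalar fused multiply-add: computes x*y + z with a single rounding.
// No intrinsics and no vendor-specific headers are needed.
double scalar_fma(double x, double y, double z) {
    return std::fma(x, y, z);
}
```

With the values from the question, `scalar_fma(1.0, 2.0, 3.0)` evaluates to 5.0. The single rounding shows up when the exact product is not representable: for x = 1 + 2^-27 and y = 1 - 2^-27, the exact product is 1 - 2^-54, so `scalar_fma(x, y, -1.0)` yields exactly -2^-54.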

Paul R