
The way to trap floating-point exceptions is architecture-dependent. This is code I have tested successfully on an Intel (x86) Mac: it takes the square root of a negative number twice, once before, and once after, enabling floating-point exception trapping. The second time, fpe_signal_handler() is called.

#include <cmath>        // for sqrt()
#include <csignal>      // for signal()
#include <cstdlib>      // for exit()
#include <iostream>
#include <xmmintrin.h>  // for _mm_setcsr

void fpe_signal_handler(int /*signal*/) {
  std::cerr << "Floating point exception!\n";
  exit(1);
}

void enable_floating_point_exceptions() {
 _mm_setcsr(_MM_MASK_MASK & ~_MM_MASK_INVALID);
 signal(SIGFPE, fpe_signal_handler);
}

int main() {
  const double x{-1.0};
  std::cout << sqrt(x) << "\n";
  enable_floating_point_exceptions();
  std::cout << sqrt(x) << "\n";
}

Compiling with the apple-clang compiler provided by Xcode

clang++ -g -std=c++17 -o fpe fpe.cpp

and running gives the following expected output:

nan
Floating point exception!

I would like to write an analogous program that does the same thing as the above program on an M1 (arm64) Mac. I tried the following:

#include <cfenv>        // for std::fenv_t
#include <cmath>        // for sqrt()
#include <csignal>      // for signal()
#include <cstdlib>      // for exit()
#include <fenv.h>       // for fegetenv(), fesetenv()
#include <iostream>

void fpe_signal_handler(int /*signal*/) {
  std::cerr << "Floating point exception!\n";
  exit(1);
}

void enable_floating_point_exceptions() {
 std::fenv_t env;
 fegetenv(&env);
 env.__fpcr = env.__fpcr | __fpcr_trap_invalid;
 fesetenv(&env);
 signal(SIGFPE, fpe_signal_handler);
}

int main() {
  const double x{-1.0};
  std::cout << sqrt(x) << "\n";
  enable_floating_point_exceptions();
  std::cout << sqrt(x) << "\n";
}

It almost works: After compiling with the apple-clang compiler provided by Xcode

clang++ -g -std=c++17 -o fpe fpe.cpp

I get the following output:

nan
zsh: illegal hardware instruction  ./fpe

I tried adding the -fexceptions flag, but that didn't make a difference. I noticed that the ARM Compiler toolchain "does not support floating-point exception trapping for AArch64 targets," but I'm not sure if this applies to M1 Macs with Apple's toolchain.

Am I correct that the M1 Mac hardware just doesn't support floating-point exception trapping? Or is there a way to modify this program so it traps the second floating-point exception and then calls fpe_signal_handler()?


Synchronously testing for exceptions within the same thread does work fine, using ISO C fetestexcept from <fenv.h> as in the cppreference example. The problem here is getting FP exceptions to actually trap so the OS delivers SIGFPE, instead of just setting sticky flags in the FP environment.
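For reference, the synchronous approach mentioned above can look roughly like the sketch below. It checks the sticky FE_INVALID flag after the call instead of relying on a trap. The helper name sqrt_raises_invalid is made up for illustration, and the volatile variables are only there to stop the compiler from folding the call at compile time.

```cpp
#include <cfenv>  // std::feclearexcept, std::fetestexcept, FE_INVALID
#include <cmath>  // std::sqrt

// Hypothetical helper (not a library function): returns true if
// computing sqrt(x) set the sticky "invalid operation" flag.
bool sqrt_raises_invalid(double x) {
  volatile double input = x;          // keep the compiler from constant-folding
  std::feclearexcept(FE_ALL_EXCEPT);  // start with no sticky flags set
  volatile double result = std::sqrt(input);
  (void)result;                       // the value itself is not needed here
  return std::fetestexcept(FE_INVALID) != 0;
}
```

Calling sqrt_raises_invalid(-1.0) should report the flag, while sqrt_raises_invalid(4.0) should not. This only tells you after the fact that something went wrong; it does not interrupt the program the way a trap would.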

Peter Cordes
  • I don't know the answer to your question, but `-fexceptions` controls support for C++ exceptions (`try/throw/catch`) and has nothing to do with floating-point exceptions / traps / signals. – Nate Eldredge Sep 06 '21 at 03:38
  • Good question, I am also surprised by this behavior. – Ibraim Ganiev Mar 16 '22 at 18:12
  • I'd assume the illegal instruction is in `fesetenv(&env);` after `env.__fpcr = env.__fpcr | __fpcr_trap_invalid;`. As a guess, maybe that's what happens when you try to set some exceptions as "unmasked" on hardware that doesn't support it. Without any other way to report failure, the CPU could trap as if you tried to execute an illegal instruction, even though this is really illegal data for a valid instruction? (Assuming `fesetenv` doesn't *always* SIGILL with other operands.) – Peter Cordes Apr 11 '22 at 18:24

2 Answers

2

Based on the long discussion below (many thanks to @Peter Cordes for his patience), it seems that on macOS on AArch64, unmasking an FP exception and then performing an operation that raises it results in a SIGILL rather than a SIGFPE. The signal code ILL_ILLTRP can be detected in the handler.

#include <fenv.h>    
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

static void
fpe_signal_handler( int sig, siginfo_t *sip, void *scp )
{
    int fe_code = sip->si_code;

    printf("In signal handler : ");

    if (fe_code == ILL_ILLTRP)
        printf("Illegal trap detected\n");
    else
        printf("Code detected : %d\n",fe_code);

    abort();
}

void enable_floating_point_exceptions()
{
    fenv_t env;
    fegetenv(&env);

    env.__fpcr = env.__fpcr | __fpcr_trap_invalid;
    fesetenv(&env);

    struct sigaction act;
    act.sa_sigaction = fpe_signal_handler;
    sigemptyset (&act.sa_mask);
    act.sa_flags = SA_SIGINFO;
    sigaction(SIGILL, &act, NULL);
}

int main(void)
{
    const double x = -1;    
    printf("y = %f\n",sqrt(x));
    enable_floating_point_exceptions();
    printf("y = %f\n",sqrt(x));
}

This results in :

y = nan
In signal handler : Illegal trap detected
Abort trap: 6

Other floating-point exceptions can be detected in a similar manner, e.g. by unmasking __fpcr_trap_divbyzero and then evaluating double x=0; double y=1/x. If the exception is left masked, the program terminates normally.
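As a rough sketch of the divide-by-zero variant, assuming the same __fpcr field and __fpcr_trap_divbyzero constant from Apple's <fenv.h> that are used above: the helper name enable_divbyzero_trap is invented here, and the preprocessor guard is my own addition so that the code still compiles (as a no-op) on other platforms.

```cpp
#include <cfenv>  // std::fenv_t, std::fegetenv, std::fesetenv

// Hypothetical helper: unmask the divide-by-zero trap on Apple arm64.
// Returns true if the trap bit was written to the FP environment, and
// false on platforms where the __fpcr field is not available.
bool enable_divbyzero_trap() {
#if defined(__APPLE__) && defined(__aarch64__)
  std::fenv_t env;
  if (std::fegetenv(&env) != 0) return false;
  env.__fpcr |= __fpcr_trap_divbyzero;  // Apple-specific field and constant
  return std::fesetenv(&env) == 0;
#else
  return false;  // assumption: no equivalent control exposed here
#endif
}
```

With the SIGILL handler from the program above installed, calling this and then dividing 1 by a zero-valued double should end up in the handler in the same way as the sqrt case.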

Without a SIGFPE, however, it doesn't seem possible to detect exactly which floating point operation triggered the signal handler. One can imagine unmasking all exceptions (e.g. using FE_ALL_EXCEPT). Then, when a bad math op generates a SIGILL, we don't have enough information in the handler to know what operation sent the signal. Unmasking all exceptions and playing around a bit with fetestexcept in the handler didn't produce very reliable results. This might take some more investigation.

Donna
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/243807/discussion-on-answer-by-donna-how-to-trap-floating-point-exceptions-on-m1-macs). – Samuel Liew Apr 12 '22 at 01:03
0

You can decode the nature of the FPU exception in the SIGILL handler by casting the third argument of your sigaction handler to a struct ucontext* and then decoding scp->uc_mcontext->__es.esr. This is the value of ESR_ELx, the Exception Syndrome Register (ELx). If its top 6 bits (EC) are 0b101100, the signal was triggered by a trapping AArch64 FPU operation. If, in that case, bit 23 of the register (TFV) is also 1, then the register's lower 7 bits match what the lower 7 bits of the fpsr register would have been had trapping been disabled (i.e., they identify the kind of FPU exception triggered by the instruction).

See section D17.2.37 of the ARMv8 architecture manual for more information ("ESR_EL1, Exception Syndrome Register (EL1)").
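The decoding itself is plain bit manipulation, so here is a minimal sketch of it as a standalone function. The struct, function and constant names are invented for illustration. The bit positions follow the description above, and the individual flag bits use the same positions as the fpsr cumulative flags (invalid operation in bit 0, divide by zero in bit 1, overflow in bit 2, underflow in bit 3, inexact in bit 4).

```cpp
#include <cstdint>

// Hypothetical names throughout; only the bit layout comes from ESR_ELx.
constexpr std::uint32_t kEcTrappedFp   = 0x2C;     // EC == 0b101100
constexpr std::uint32_t kFlagInvalidOp = 1u << 0;  // IOF
constexpr std::uint32_t kFlagDivByZero = 1u << 1;  // DZF
constexpr std::uint32_t kFlagOverflow  = 1u << 2;  // OFF
constexpr std::uint32_t kFlagUnderflow = 1u << 3;  // UFF
constexpr std::uint32_t kFlagInexact   = 1u << 4;  // IXF

struct FpTrapInfo {
  bool is_fp_trap;      // EC field says "trapped FP exception (AArch64)"
  bool flags_valid;     // TFV (bit 23) is set, so the flag bits are meaningful
  std::uint32_t flags;  // lower 7 bits of ESR, or 0 if !flags_valid
};

// Decode a raw 32-bit ESR value as described in the answer text.
FpTrapInfo decode_fp_trap(std::uint32_t esr) {
  FpTrapInfo info{};
  info.is_fp_trap  = ((esr >> 26) & 0x3Fu) == kEcTrappedFp;  // top 6 bits
  info.flags_valid = info.is_fp_trap && ((esr >> 23) & 1u) != 0;
  info.flags       = info.flags_valid ? (esr & 0x7Fu) : 0u;
  return info;
}
```

In the handler you would pass in the esr value read from the context, then test info.flags against the constants, e.g. info.flags & kFlagInvalidOp for the sqrt(-1) case. One caveat, as far as I know: in the default macOS SDK headers the field is spelled __esr (and the context type is ucontext_t), so the exact member name may need adjusting for your SDK.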