
I have been trying to write some error-protection clauses to identify problems in a DLL provided to us by a third party. The DLL may have problems (memory exceptions, floating-point errors, etc.), and it would be advantageous to be able to identify these errors without access to the source code.

I have something put together from various SEH error handling routines, but although it works, there are several... inconsistencies with it. I'm trying to isolate each one, and I'm going to ask a question on each one individually.

This one concerns GetExceptionCode(), used in the SEH __try/__except clause to identify the error. It doesn't seem to do so reliably.

This is a clear divide-by-zero case:

#include <float.h>      // defines of _EM_OVERFLOW, etc.
#include <string.h>     // strncpy_s & strncat_s
#include <stdlib.h>     // malloc
#include <excpt.h>      // EXCEPTION_EXECUTE_HANDLER
#include <iostream>     // cout
#include <bitset>       // bitset
#include <conio.h>      // _kbhit
#pragma fenv_access (on)


const unsigned int SERIOUS_FP_EXCEPTIONS = _EM_DENORMAL | _EM_ZERODIVIDE | _EM_INVALID;
const unsigned int MINOR_FP_EXCEPTIONS = _EM_OVERFLOW | _EM_UNDERFLOW | _EM_INEXACT;

int main()
{
    double numerator = 1.0;
    double denominator = 0.0;
    double result = 0.0;

    unsigned int _previous_floating_point_control;
    _controlfp_s(&_previous_floating_point_control, 0, 0);
    _controlfp_s(nullptr, MINOR_FP_EXCEPTIONS, _MCW_EM);
    __try {
        result = numerator / denominator;
        _controlfp_s(NULL, _previous_floating_point_control, _MCW_EM);
    }
    __except (EXCEPTION_EXECUTE_HANDLER)
    {
        std::cout << "_EM_INEXACT    = " << std::bitset<32>(_EM_INEXACT) << std::endl;
        std::cout << "_EM_UNDERFLOW  = " << std::bitset<32>(_EM_UNDERFLOW) << std::endl;
        std::cout << "_EM_OVERFLOW   = " << std::bitset<32>(_EM_OVERFLOW) << std::endl;
        std::cout << "_EM_ZERODIVIDE = " << std::bitset<32>(_EM_ZERODIVIDE) << std::endl;
        std::cout << "_EM_INVALID    = " << std::bitset<32>(_EM_INVALID) << std::endl;
        std::cout << "_EM_DENORMAL   = " << std::bitset<32>(_EM_DENORMAL) << std::endl;
        std::cout << "_EM_AMBIGUOUS  = " << std::bitset<32>(_EM_AMBIGUOUS) << std::endl;
        std::cout << std::endl;
        std::cout << "                                      divide-by-zero" << std::endl;
        std::cout << "                                             |" << std::endl;
        std::cout << "            ambiguous code?                underflow" << std::endl;
        std::cout << "                  |                          : |" << std::endl;
        std::cout << "                  v                          v v" << std::endl;
        std::cout << "Exception code = " << std::bitset<32>(GetExceptionCode()) << std::endl;
        std::cout << "                             ^              ^ ^ ^" << std::endl;
        std::cout << "                             |              : : |" << std::endl;
        std::cout << "                     denormal number     inexact number" << std::endl;
        std::cout << "                                            : |" << std::endl;
        std::cout << "                                          overflow" << std::endl;
        std::cout << "                                            |" << std::endl;
        std::cout << "                                     invalid number" << std::endl;

        if (GetExceptionCode() & _EM_ZERODIVIDE)
            std::cout << "ERROR! Divide By Zero!" << std::endl;
        else
            std::cout << "No divide by zero found here!" << std::endl;
        _controlfp_s(NULL, _previous_floating_point_control, _MCW_EM);
    }

    std::cout << "result = " << result << std::endl;

    while (!_kbhit())   // Wait until a key is pressed to close console.
    { }
}

And this prints the following:

_EM_INEXACT    = 00000000000000000000000000000001
_EM_UNDERFLOW  = 00000000000000000000000000000010
_EM_OVERFLOW   = 00000000000000000000000000000100
_EM_ZERODIVIDE = 00000000000000000000000000001000
_EM_INVALID    = 00000000000000000000000000010000
_EM_DENORMAL   = 00000000000010000000000000000000
_EM_AMBIGUOUS  = 10000000000000000000000000000000

                                      divide-by-zero
                                             |
            ambiguous code?                underflow
                  |                          : |
                  v                          v v
Exception code = 11000000000000000000001010110101
                             ^              ^ ^ ^
                             |              : : |
                     denormal number     inexact number
                                            : |
                                          overflow
                                            |
                                     invalid number
No divide by zero found here!
result = 0

It has identified a problem (great), but hasn't diagnosed it quite correctly.

Worse still, when the clause is replaced with a call to a dll which is missing a dependency, I get:

                       f.p. exceptions
     denormal number         |
            |               _|_
            v              /   \
11000000011011010000000001111110
         ^^  ^ ^         ^^
         ||  | |         ||
         \________________/
           unknown codes

A similar result is returned in the case of a SIGSEGV error (segmentation fault). This means that we're misdiagnosing other problems as floating point exceptions.

So my questions are:

  1. Is this general approach correct, or am I misunderstanding something fundamental?
  2. Why is this not picking up the simple case of a divide-by-zero? Is it hardware dependent?
  3. Can I find out what the rest of the error bits coming from GetExceptionCode() mean? That would be really useful.

PS: Please don't comment or reply to say that I should check whether the denominator is 0 - I know, and I do this in all the code I have control over.

Mike Sadler
  • It seems you're thinking that `_EM*` from `float.h` represents the different bits of `GetExceptionCode()`. Well, it doesn't. They're unrelated. – MSalters Jan 06 '17 at 13:18
  • I've just seen that in oreubens' answer below - I'm trying the EXCEPTION_FLT_UNDERFLOW ones for my checking instead... – Mike Sadler Jan 06 '17 at 14:12
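
To illustrate the point made in the comments, here is a minimal, portable sketch (not the asker's code) that contrasts the bitmask test with an equality test. The numeric values are copied by hand from `float.h` and `ntstatus.h` so that it builds without the Windows headers; the `k...` constants and both helper functions are hypothetical names introduced only for this example.

```cpp
#include <cstdint>

// Assumed values, copied by hand so the sketch builds without Windows headers.
constexpr std::uint32_t kEmZeroDivide             = 0x00000008u; // _EM_ZERODIVIDE (float.h)
constexpr std::uint32_t kStatusAccessViolation    = 0xC0000005u; // STATUS_ACCESS_VIOLATION
constexpr std::uint32_t kStatusFloatDenormal      = 0xC000008Du; // STATUS_FLOAT_DENORMAL_OPERAND
constexpr std::uint32_t kStatusFloatDivideByZero  = 0xC000008Eu; // STATUS_FLOAT_DIVIDE_BY_ZERO
constexpr std::uint32_t kStatusFloatMultipleTraps = 0xC00002B5u; // STATUS_FLOAT_MULTIPLE_TRAPS

// The test used in the question: treats the exception code as a set of flags.
constexpr bool mask_says_divide_by_zero(std::uint32_t code)
{
    return (code & kEmZeroDivide) != 0;
}

// Exception codes are whole values, not flag sets, so compare with ==.
constexpr bool is_divide_by_zero(std::uint32_t code)
{
    return code == kStatusFloatDivideByZero;
}
```

With these values, `mask_says_divide_by_zero(kStatusFloatMultipleTraps)` is false, which matches the "No divide by zero found here!" output, while `mask_says_divide_by_zero(kStatusFloatDenormal)` is true even though that code is not a divide-by-zero.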

2 Answers


You will need something along the lines of

static DWORD dwSavedExceptionCode = 0; // set by the filter, read by the handler

DWORD exception_filter(DWORD dwExceptionCode)
{
    // Use dwExceptionCode to handle only the types of exceptions you want.
    // Save it here if you want to use it inside your handler.
    dwSavedExceptionCode = dwExceptionCode;
    return EXCEPTION_EXECUTE_HANDLER; // or another value, depending on dwExceptionCode
}

Your exception handler...

__try
{
    something();
}
__except (exception_filter(GetExceptionCode()))
{
    // Use the saved code here rather than calling GetExceptionCode() again.
    // GetExceptionInformation() is only valid in the filter expression, so if
    // you need exception info as well, pass it to the filter and save the
    // values you need.
    switch (dwSavedExceptionCode)
    {
        case EXCEPTION_FLT_OVERFLOW:
              ItWasAFloatingPointOverflow();
              break;
        case EXCEPTION_FLT_DIVIDE_BY_ZERO:
              ItWasAFloatingDivideByZero();
              break;
        // ... add a case for each other exception type you want handled
        // (the ones the filter returned EXCEPTION_EXECUTE_HANDLER for) ...
        default:
              break;
    }
}
oreubens
  • That looks like the "classic" arrangement, but without the _controlfp_s it doesn't register the floating point errors at all (I'm guessing that this is because it is built in Release, and the default is not to check). The big difference is that you're using EXCEPTION_FLT_OVERFLOW and not _EM_OVERFLOW - and they have different values. I'm going to give that a go! – Mike Sadler Jan 06 '17 at 14:09
  • Sorry, I was trying to put code in here, and it really didn't work. I put the switch statement in exception_filter, but otherwise followed the logic above. However, all of the problems now slip through to the default - EXCEPTION_FLT_DIVIDE_BY_ZERO and so forth are not matched by the case. – Mike Sadler Jan 06 '17 at 15:58
  • I've marked this as the answer, as it shows the general form, but I needed to add @Remy Lebeau's code into the switch as well in order to get it to work correctly. – Mike Sadler Jan 10 '17 at 11:03
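
The combination described in the last comment (the switch from this answer plus the extra code from the other answer) could look roughly like the sketch below. It is portable on purpose: the status values are copied by hand from `ntstatus.h` as assumptions, and `Status`, `FpFault` and `classify` are hypothetical names, not part of any Windows API. In a real handler the argument would be the exception code saved by the filter.

```cpp
#include <cstdint>

// Assumed values from ntstatus.h, copied so the sketch builds without Windows headers.
enum Status : std::uint32_t {
    FloatDivideByZero   = 0xC000008Eu, // STATUS_FLOAT_DIVIDE_BY_ZERO
    FloatInvalidOp      = 0xC0000090u, // STATUS_FLOAT_INVALID_OPERATION
    FloatOverflow       = 0xC0000091u, // STATUS_FLOAT_OVERFLOW
    FloatUnderflow      = 0xC0000093u, // STATUS_FLOAT_UNDERFLOW
    FloatMultipleFaults = 0xC00002B4u, // STATUS_FLOAT_MULTIPLE_FAULTS
    FloatMultipleTraps  = 0xC00002B5u  // STATUS_FLOAT_MULTIPLE_TRAPS (seen in the question)
};

enum class FpFault {
    NotFloatingPoint, DivideByZero, InvalidOp, Overflow,
    Underflow, MultipleFaults, MultipleTraps
};

// Map an exception code to a coarse category by exact comparison.
constexpr FpFault classify(std::uint32_t code)
{
    switch (code) {
        case FloatDivideByZero:   return FpFault::DivideByZero;
        case FloatInvalidOp:      return FpFault::InvalidOp;
        case FloatOverflow:       return FpFault::Overflow;
        case FloatUnderflow:      return FpFault::Underflow;
        case FloatMultipleFaults: return FpFault::MultipleFaults;
        case FloatMultipleTraps:  return FpFault::MultipleTraps;
        default:                  return FpFault::NotFloatingPoint;
    }
}
```

Anything that is not one of the listed floating-point codes, such as an access violation, falls through to `NotFloatingPoint` instead of being mistaken for a floating-point exception.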

Exception code = 11000000000000000000001010110101

That value is 0xC00002B5, aka STATUS_FLOAT_MULTIPLE_TRAPS.

See "Why after enabling floating point exceptions I got multiple traps error".
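
As a quick check of the conversion above, the following sketch turns the printed bit string back into a number. `parse_bits` is a hypothetical helper written for this example, and the status value is copied by hand from `ntstatus.h` so the code builds without the Windows headers.

```cpp
#include <cstdint>

// Assumed value from ntstatus.h.
constexpr std::uint32_t kStatusFloatMultipleTraps = 0xC00002B5u;

// Hypothetical helper: fold a string of '0'/'1' characters into an integer,
// most significant bit first.
constexpr std::uint32_t parse_bits(const char* s, std::uint32_t acc = 0)
{
    return *s == '\0'
        ? acc
        : parse_bits(s + 1, (acc << 1) | static_cast<std::uint32_t>(*s - '0'));
}

// The two bit patterns printed in the question.
constexpr std::uint32_t kDivideCase  = parse_bits("11000000000000000000001010110101");
constexpr std::uint32_t kMissingDll  = parse_bits("11000000011011010000000001111110");
```

Here `kDivideCase` equals `kStatusFloatMultipleTraps`, and the pattern from the missing-dependency case works out to 0xC06D007E.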

Remy Lebeau
  • That did the trick! Do you know why it doesn't appear in the list of exception codes in MinWinBase.h like the others? I had thought I'd struck gold with that file... – Mike Sadler Jan 10 '17 at 11:02
  • @MikeSadler it is an NT status code, which is defined in `ntstatus.h`. Many of the `EXCEPTION_...` codes are just aliases for corresponding status codes. – Remy Lebeau Jan 10 '17 at 15:31
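
Since these are NT status codes, the "rest of the bits" asked about in the question can be read as the fields of an NTSTATUS value rather than as individual flags. The sketch below splits a code into those fields; the four helper names are hypothetical, and the layout in the comment is the documented NTSTATUS layout as I understand it.

```cpp
#include <cstdint>

// NTSTATUS field layout:
//   bits 31-30  severity (3 = error)
//   bit  29     customer-defined flag
//   bit  28     reserved
//   bits 27-16  facility
//   bits 15-0   code
constexpr std::uint32_t status_severity(std::uint32_t s) { return s >> 30; }
constexpr bool          status_customer(std::uint32_t s) { return ((s >> 29) & 1u) != 0; }
constexpr std::uint32_t status_facility(std::uint32_t s) { return (s >> 16) & 0x0FFFu; }
constexpr std::uint32_t status_code(std::uint32_t s)     { return s & 0xFFFFu; }
```

For 0xC00002B5 this gives severity 3, facility 0 and code 0x2B5. For the missing-dependency value from the question, 0xC06D007E, it gives severity 3, facility 0x6D and code 0x7E (126). 126 is also the Win32 error ERROR_MOD_NOT_FOUND, which would fit a DLL with a missing dependency, though that reading is an inference and is not stated anywhere in this thread.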