
I'm getting weird behaviour from an executable compiled with different versions of gcc: every build raises SIGFPE, and the best part is that I have no floating point of any kind in my code. If someone could shed some light on this ... I literally don't know where to start debugging it. The bug is triggered by every gcc installation I have, from 4.9 to 6.0.

Here is a snippet that reproduces the problem:

// Floating point exception - SIGFPE
#include <stdio.h>
typedef unsigned int T;
int main()
{
#define N 256 
  for (T i = 0; i < N; ++i)
    {
      i += (i % i);
      printf("%u\t", i);
    }
}
// bug uncovered with
// gcc version 4.9.2 (Debian 4.9.2-10)
// gcc version 5.1.0 (GCC)
// gcc version 6.0.0 20150517 (experimental) (GCC)
// using -std=c11 or -std=c99

The purpose of this code is only to reproduce the problem. I know its logic doesn't make much sense (the modulo part), but clang passes the test and no version of gcc does, and I would like to know whether there is a technical explanation for this behaviour.

gzp
  • `i % i` divides by 0. – BLUEPIXY Jun 03 '15 at 23:29
  • @BLUEPIXY a number modulo itself should result in `0`, I can't see the "division" you are mentioning or the problem in general . There should be no problem in adding the result of a modulo operation . – gzp Jun 03 '15 at 23:32
  • Division is done when determining the remainder. try `i += (i % i);` change to `i += 0;` – BLUEPIXY Jun 03 '15 at 23:33
  • @BLUEPIXY can you post a complete answer and expand on that? I don't really get what you are mentioning. – gzp Jun 03 '15 at 23:36
  • @gzp: `0 % 0` is undefined (just like `0 / 0`), so it generates a SIGFPE... – Chris Dodd Jun 03 '15 at 23:38
  • @ChrisDodd ok, I get the math, I don't get why floating point computation gets involved in this at all when I use an `unsigned int`. – gzp Jun 03 '15 at 23:40
  • Does it still behave badly without the typedef? – wildplasser Jun 03 '15 at 23:42
  • @ChrisDodd to be clear my point is not about the logic of the math involved, it's about why something related to floating point computation kicks in while doing integer math . – gzp Jun 03 '15 at 23:45
  • @gzp: No floating point is involved. It's a historical accident that POSIX calls the signal `SIGFPE` even though it is generated for all numerical exceptions, both integer and floating point. `SIGFPE` has subcodes `FPE_INTDIV` (div/mod by zero) and `FPE_INTOVF` (overflow) which can be generated by integer instructions. – Chris Dodd Jun 03 '15 at 23:46
  • @ChrisDodd that's just like watching the world burn ... it makes no sense at all even as an historical accident; how do you confuse floating-point with integers and manage to smash them together under the same label ? It really is an achievement . – gzp Jun 03 '15 at 23:53
  • @gzp: Do you understand the phrase "historical reasons"? Just take it as a general arithmetic exception. Changing it would break too much existing legacy software. Banking software, for instance, notoriously relies on established behaviour. You're welcome to rewrite billions of lines of code (including COBOL and FORTRAN). Further explanation: decent CPUs only had floating point exceptions, but none for integer division. That was actually caught by program code. – too honest for this site Jun 04 '15 at 00:02
  • @Olaf I would really like to know how this happened, there must be a real explanation of what happened in that room where someone wrote specs for this signals ... – gzp Jun 04 '15 at 00:04
  • @gzp: How about searching the web, digging in historical stuff (news archives might be a good start). But do not annoy or blame ppl here, it is not their fault (well, unless there is one here who actually defined that - possible, but not likely). You already got an answer to your question. That is not a forum! – too honest for this site Jun 04 '15 at 00:07
  • @Olaf I'm not blaming anyone, I'm just trying to expand and discuss – gzp Jun 04 '15 at 00:10
  • @gzp: C11 draft, 6.5.5#5: "The result of the / operator ... the result of the % operator is the remainder. In both operations, if the value of the second operand is zero, the behavior is undefined." So, you can get anything. Do not complain if demons crawl up your keyboard. – too honest for this site Jun 04 '15 at 00:11
  • @gzp: Just again: This is a Q&A site, not a discussion forum! You've got your answer on two levels now (C and POSIX). – too honest for this site Jun 04 '15 at 00:12

2 Answers


I ran the code under Cygwin, and gdb dumped this trace:

$ cat sigfpe.exe.stackdump
Exception: STATUS_INTEGER_DIVIDE_BY_ZERO at rip=00100401115
rax=0000000000000000 rbx=000000000022CB20 rcx=0000000000000001
rdx=0000000000000000 rsi=000000060003A2F0 rdi=0000000000000000
r8 =0000000000000000 r9 =0000000000000000 r10=0000000000230000
r11=0000000000000002 r12=0000000000000000 r13=0000000000000001
r14=000000000022CB63 r15=000000000022CB64
rbp=000000000022CAD0 rsp=000000000022CAA0
program=C:\cygwin64\home\luser\sigfpe.exe, pid 6808, thread main
cs=0033 ds=002B es=002B fs=0053 gs=002B ss=002B
Stack trace:
Frame        Function    Args
0000022CAD0  00100401115 (00000000020, 30001000000FF00, 0018004830F, 0000022D680)
0000022CBC0  00180048380 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  0018004607C (00000000000, 0003E704021, 00000000000, 0000000002D)
00000000000  00180046114 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  00100401191 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  00100401010 (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  000772E59CD (00000000000, 00000000000, 00000000000, 00000000000)
00000000000  0007741B981 (00000000000, 00000000000, 00000000000, 00000000000)
End of stack trace

The clue is in the operation `i += (i % i)`.

On the first iteration `i` is 0, so `i % i` is `0 % 0`, which is a division by zero.

Have you tried to catch the signal?

Look at the C11 standard, page 265: SIGFPE - an erroneous arithmetic operation, such as zero divide or an operation resulting in overflow.

It is not a compiler bug: in C, the behaviour of `%` with a zero second operand is undefined, so raising SIGFPE is a permitted outcome.
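Catching the signal could look roughly like the sketch below. It assumes a POSIX system where `sigaction` with `SA_SIGINFO` is available; the helper names `describe_fpe_code`, `on_sigfpe`, and `install_fpe_handler` are made up for illustration. The handler reads `si_code` to tell an integer divide by zero (`FPE_INTDIV`) from other arithmetic faults, then exits instead of returning, since resuming after a faulting instruction is not safe.

```c
/* Sketch only: report why SIGFPE was raised. Assumes a POSIX system. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose sigaction/siginfo_t under -std=c11 */
#endif
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: map a SIGFPE si_code to a readable reason. */
static const char *describe_fpe_code(int code)
{
    switch (code) {
    case FPE_INTDIV: return "integer divide by zero";
    case FPE_INTOVF: return "integer overflow";
    case FPE_FLTDIV: return "floating-point divide by zero";
    default:         return "other arithmetic exception";
    }
}

/* Write a string to stderr using only async-signal-safe calls. */
static void emit(const char *s)
{
    size_t len = 0;
    while (s[len] != '\0')
        ++len;
    ssize_t written = write(STDERR_FILENO, s, len);
    (void)written;  /* nothing useful can be done on failure here */
}

/* Handler: print the reason, then terminate without returning. */
static void on_sigfpe(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)context;
    emit("SIGFPE: ");
    emit(describe_fpe_code(info->si_code));
    emit("\n");
    _exit(EXIT_FAILURE);
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int install_fpe_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigfpe;
    sa.sa_flags = SA_SIGINFO;  /* request the three-argument handler form */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGFPE, &sa, NULL);
}
```

Calling `install_fpe_handler()` at the top of `main` in the question's snippet would, on a platform that traps integer division by zero (x86, for example), be expected to report the integer-divide reason rather than anything floating-point related.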

t0mm13b
  • no, I just ran this inside `gdb` and got the `SIGFPE`, which was the weirdest possible result for me, and I didn't check further because I was misled by the naming of this signal and convinced that the problem was somewhere else or part of a gcc bug. Naming an integer-related signal like that is really a poor choice. – gzp Jun 03 '15 at 23:51
  • by the way, `clang` optimizes this out completely, probably through some dead code elimination; there is no trace of the modulo operation in the assembly generated by `clang`, which is why it runs. – gzp Jun 03 '15 at 23:55
  • It is not a bug at all, in fact, its standard, [C 11 Standard](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf), on page 265, *SIGFPE - an erroneous arithmetic operation, such as zero divide or an operation resulting in overflow* – t0mm13b Jun 03 '15 at 23:59
  • yes, my problems start when you starting naming your dog "cat", and your cat "dog" . this was a discovery for me and I really think it's just an error ; it's not even about the math of the modulo at this point, it should be considered an error emitting a signal with that name in this case. – gzp Jun 04 '15 at 00:07
  • It actually is UB: 6.5.5#5; And 7.14#4 makes this signal (among most others) optional. – too honest for this site Jun 04 '15 at 00:21
  • @gzp SIGSEGV covers all sorts of memory access errors too, not just accessing the wrong segment. You can only pack so much info into 3 or 4 letters, at the end of the day there's no substitute for reading the documentation of what that signal means. – M.M Jun 04 '15 at 00:24
  • @MattMcNabb: well, if 17576 or 456976 combinations are not enough, we might include other unicode letters. "umlauts" come into my mind ;-)) – too honest for this site Jun 04 '15 at 00:26

Do not complain if demons crawl up your keyboard when using undefined behaviour (UB):

C11 draft, 6.5.5#5: "The result of the / operator ... the result of the % operator is the remainder. In both operations, if the value of the second operand is zero, the behavior is undefined."

UB can be anything. You should actually be happy to get an exception, whatever it is called (it actually does even show the correct reason), rather than have the program silently produce wrong results (the worst that can actually happen!). On many CPUs, you will not notice anything. Just enable compiler warnings; that might help detect such cases (though it is not guaranteed).
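One way to stay clear of the undefined behaviour is to test the divisor before using `%`. A minimal sketch, where `checked_mod` is a hypothetical helper name and not something from the question:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: stores a % b in *out and returns true,
 * or returns false and leaves *out untouched when b is zero.
 */
static bool checked_mod(unsigned int a, unsigned int b, unsigned int *out)
{
    if (b == 0)
        return false;  /* a % 0 is undefined behaviour, so never evaluate it */
    *out = a % b;
    return true;
}
```

In the loop from the question, `i += (i % i)` would become a call to `checked_mod(i, i, &r)` followed by `i += r` only when the call returns true. Building with `-fsanitize=undefined`, which recent GCC and clang releases support, is another way to have such a division reported at run time.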

too honest for this site