Intermittent segmentation faults with straightforward C++ program

Question

I'm currently working through Thinking in C++, and chapter 9, exercise 15 gives instructions to time the difference between inline and non-inline constructors. In doing so, I created a metric shedload of object instances in an array, but once I get up to a certain count, the program begins segfaulting intermittently. I'm not doing anything peculiar, and the number doesn't look magical (it isn't obviously close to a power of 2 or anything like that), so it strikes me as very strange. The objects are all very small, each containing a single integer.

I'm not using any custom compilation or optimization options, and I'm using standard g++ (not icc or anything).

I'm stumped as heck by this, in what should be a straightforward program. Any insight would be appreciated, as even the strace output (below) doesn't give me any hints.

Thank you in advance.

ex15.cc:

#include <cstdio>    // getchar
#include <ctime>     // clock, clock_t
#include <iostream>
using namespace std;

class A
{
    static int max_id;
    int id;
public:
    A() { id = ++max_id; }
};
int A::max_id = 0;

class B
{
    A a;
public:
    B() {}
};

int main()
{
    clock_t c1, c2;
    cout << "Before" << endl;
    c1 = clock();
    B b[2093550];   // intermittent segfault around this range
    c2 = clock();
    cout << "After; time = " << c2 - c1 << " usec." << endl;
    getchar();
}

Run log:

$ ./ex15
Before
After; time = 40000 usec.
$ ./ex15
Segmentation fault
$ ./ex15
Before
After; time = 40000 usec.
$ ./ex15
Segmentation fault
$ ./ex15
Before
After; time = 40000 usec.
$ ./ex15
Before
After; time = 40000 usec.
$ ./ex15
Segmentation fault

The strace output shows it dying here:

mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f93000
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV +++

And from a successful run:

mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f4c000
write(1, "Before\n", 7)                 = 7
times({tms_utime=0, tms_stime=0, tms_cutime=0, tms_cstime=0}) = -1160620642
times({tms_utime=4, tms_stime=0, tms_cutime=0, tms_cstime=0}) = -1160620637
write(1, "After; time = 40000 usec.\n", 26) = 26
fstat64(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f4b000
read(0, "\n", 1024)                     = 1
munmap(0xb7f4c000, 4096)                = 0
exit_group(0)                           = ?

Hardware issues? Try running on a different machine to see if you at least get consistent results. — Pablo Santa Cruz, Feb 25 '11 at 02:12
I was considering that and also possibly a compiler issue, as my version of g++ is a bit older, but it seems to have been the stack overflow issue as noted below. Thanks, though! — Sdaz MacSkibbons, Feb 25 '11 at 02:26

score 3 · Answer 1 · answered Feb 25 '11 at 02:11

Allocating an array of 2093550 B objects on the stack most probably causes a stack overflow. Dynamically allocate it with new to avoid the segmentation fault.

sth

Oh, jeez. It's the name of the site, even! Thank you, mate. – Sdaz MacSkibbons Feb 25 '11 at 02:12

Hope you don't mind, I'm going to accept Peter's answer, as you have 43k points already. ;) – Sdaz MacSkibbons Feb 25 '11 at 02:26

Peter Huene · Accepted Answer · 2011-02-25T04:28:36.860

3

If sizeof(B) is 4 bytes, that puts the size of that array (b) at 8374200 bytes. That's pretty close to what I'm guessing is your default maximum thread stack size of 8 MiB (8388608 bytes). So it looks like you're overflowing your stack.

answered Feb 25 '11 at 02:12

Yes indeed, I changed it as you and sth suggested, and that seems to have been the issue. I also reset my ulimits, leaving the code as above, just to make sure, and that was the case. However, I'm curious why it would be intermittent. The range seemed to be +/- 20 or so objects. It doesn't seem that already-allocated space for the stack should change between runs of the same program in a VM system..? (I know it's not that much, but I'd be curious as to what's going on if you have any ideas.) – Sdaz MacSkibbons Feb 25 '11 at 02:22

So I was curious myself as to why it could ever be an intermittent failure. It's my understanding that GCC should be emitting a stack probe because this allocation exceeds the expected page size of 4K. Without a probe, an adjustment to esp over a page in size could skip over the guard page and any access to the stack beyond the guard page would be an unexpected page fault. Interestingly, with "char foo[4096]", gcc emits a stack probe. With "B foo[4096]", it does not (-fstack-check fixes that). It seems odd that GCC doesn't emit a probe for the UDT. – Peter Huene Feb 25 '11 at 05:50

With the original source on your failing system, could you supply -fstack-check to the compiler's command line and see if the intermittent failure becomes a consistent failure? – Peter Huene Feb 25 '11 at 05:52
Well, I used the source above (as it's exactly what I was using before), and recompiled with `g++ -fstack-check test.cc -o test`, and after doing a binary search to get to the new failure range, it now segfaults over a range of about 17500-18000 `B` objects allocated onto the stack. That's just shy of about 64k bytes or so it looks like. However, it's still intermittent. What a puzzling thing! At least my original question is solved, for which I thank you again, but seemingly non-deterministic behavior in computing always piques my curiosity! – Sdaz MacSkibbons Feb 25 '11 at 07:01
