I am struggling with a stack overflow exception that occurs while rethrowing a different exception. The rethrown exception is used to tear down the call stack after a recursive function has called itself more than a certain number of times (to prevent a stack overflow from occurring).

I managed to write a small program that reproduces this issue. It only occurs when I compile the program for x64 (Release and Debug).
I tested the snippet with MSVS2012 and MSVS2013 (StackSize=1MB, the default). With G++ the same problem occurs after about 5000 calls.

Code:

#include <exception>
#include <iostream>
using namespace std;

void recursiveFunction(int childCalls) {
  cout << "Recursive call, left calls: " << childCalls << endl;
  if (childCalls == 0) {
    cout << "Throwing std::exception" << endl;
    // Note: the const char* constructor of std::exception is an MSVC extension
    throw std::exception("Target depth reached");
  }

  try {
    recursiveFunction(childCalls - 1);
  } catch (std::exception&) {
    cout << "Caught exception at level: " << childCalls << endl;
    throw; //Simply rethrow exception
  }
}

int main() {
  //How many calls cause a stack overflow during unwinding with x64
  const int calls = 120; 

  //How many recursive calls I can make before the call stack overflows due to recursive calls
  //const int calls = 10600; 

  cout << "Initiating " << calls << " recursive calls" << endl;
  recursiveFunction(calls);
}

The program's output is:

Initiating 120 recursive calls
Recursive call, left calls: 120
Recursive call, left calls: 119
Recursive call, left calls: 118
...
Recursive call, left calls: 2
Recursive call, left calls: 1
Recursive call, left calls: 0
Throwing std::exception
Caught exception at level: 1
Caught exception at level: 2
Caught exception at level: 3
...
Caught exception at level: 104
Caught exception at level: 105  <-- Stack overflow here!!!

What I did not expect is that the stack overflow happens when the call stack has mostly been cleaned up and only 15 call frames are left to unwind. It seems as if rethrowing an exception allocates stack space instead of freeing it.

When I debug the program with WinDbg and list the stack frames (knf) at the moment of the stack overflow, I get the following picture:

0:000> knf
 #   Memory  Child-SP          RetAddr           Call Site
00           00000065`d5c37260 00007ffc`5ac10658 MSVCR110D!_chkstk+0x37
01        18 00000065`d5c37278 00007ffc`5ac105bf MSVCR110D!write_nolock+0x18
02         8 00000065`d5c37280 00007ffc`5ab23db1 MSVCR110D!write+0x21f
...
0a        f0 00000065`d5c37710 00007ffc`5ac09150 Crashtest!`recursiveFunction'::`1'::catch$0+0x26
0b        40 00000065`d5c37750 00007ffc`5abf93f2 MSVCR110D!CallSettingFrame+0x20
0c        30 00000065`d5c37780 00007ffc`7dd3a193 MSVCR110D!_CxxCallCatchBlock+0x162
0d        a0 00000065`d5c37820 00007ff7`f14714ca ntdll!RcConsolidateFrames+0x3
0e     f7970 00000065`d5d2f190 00007ff7`f14714ca Crashtest!recursiveFunction+0xba
0f        60 00000065`d5d2f1f0 00007ff7`f14714ca Crashtest!recursiveFunction+0xba
...

Note: A frame's size is shown in the Memory column of the following line.
The frame of ntdll!RcConsolidateFrames is 0xf7970 (1,014,128) bytes large and thus occupies roughly 96.7% of the total available stack size of 1 MB.

What bugs me most is the fact that I can (as noted in the snippet) call the function recursively almost 10,600 times before the call stack is used up and another call leads to a stack overflow. But if I abort the recursion with an exception after more than 120 calls, I still get a stack overflow, which is exactly what this exception was meant to prevent.
So compiling the program with a larger stack only shifts the issue to a slightly higher constant. As mentioned earlier, this problem only occurs with x64 compilation. If compiled for Win32, the program never runs into a stack overflow once the std::exception has been thrown.

How does it make sense for a program to allocate more stack space during stack unwinding than it releases?
How can I solve this?
Because this is only a very simplified case, I cannot simply replace the throw with a special return value in the original application.

Microsoft connect request (deleted)

EDIT: Microsoft simply deleted the request without providing any help. I received the following answer asking for more details, but the request was deleted one day after I replied.

Thanks for reporting this issue. While large stack usage is not ideal, it is a significant consequence of how EH is implemented on Windows non-x86 platforms. One thing to note is that the stack usage under RcConsolidateFrames is a bit misleading. The point of that function is to make the unwinder hide a bunch of intermediate frames, so that stack usage is reflective of 105 instances of the EH mechanism running (one per rethrow that executed), plus all of your recursive child calls.

Could you please share more information about the real-world scenario that this high stack usage is blocking? Fixing this for a general throw from a catch may not be so easy, but it might be possible to improve this if we can make some simplifying assumptions about the scenario that matters.

Thanks,
Neeraj Singh VC++ Compiler Backend Developer

In the meantime I suspected that the problem lay in the rethrowing mechanism, which rethrows the same exception object, allocated on the stack, over and over again. I thought that one of

  • copying the exception and throwing a copy of the exception
  • throwing an exception, created with new

could solve the issue, but there was no difference in the result.

whY
    The exception model is very different for [x64](https://msdn.microsoft.com/en-us/library/1eyas8tf.aspx) than it is for x86. You said this is a simple example, but it is difficult for me to imagine how recursively throwing an exception might possibly be smart design. – Cody Gray - on strike May 03 '17 at 13:52
  • @CodyGray The actual application is a language interpreter. The exception is thrown if the interpreted code contains too many nested calls to free up the callstack and gracefully return control to the top level executing event handler. – whY May 03 '17 at 14:22
  • @HansPassant Thanks for the hint to `_resetstkoflw` with that I at least know how to recover from a stack overflow. Could you be a bit more specific on the 8KB stack reserve and what you mean by that? I would expect to have more space available than 8KB after unwinding 105 call stacks. Are you saying that a function, which may be recursively called, **must not** contain a try-catch/rethrow? This seems like a heavy restriction on the language. – whY May 03 '17 at 14:27
  • @HansPassant OK, so my code didn't unwind any frames when the SO occurred. Even if the stack frames of every called function are still on the stack, I don't understand why the stack would overflow because I can call the recursive function up to 10.600 times, before the callstack overflows due to the recursive calls. So the stack should be at most 2% used up, when 120 call frames are on the stack. Why does it change so dramatically when rethrowing an exception up the callstack? What is all this space used for? – whY May 03 '17 at 14:58
  • Sorry, I've been posting very misleading comments, too focused on the SO. It seems to have a hard time unwinding frames past the catch clauses that are active in the nested frames. Not a lot of consolidating going on. Not pretty. Still a problem in VS2017 so technically Microsoft Support can help you dig deeper into the root issue. Or use connect.microsoft.com if it isn't urgent. – Hans Passant May 03 '17 at 15:54
