This happens if the parent crashes after cloning the child process but before sending the unblocking byte with SendContinueSignalToChild(). In this case the pipe file handle remains open and the child stays blocked indefinitely on read(...) within WaitForContinueSignal(). After the crash, the child is adopted by the init process.
Steps to reproduce:
1. Simulate a parent crash in google_breakpad::ExceptionHandler::GenerateDump(CrashContext *context):
...
const pid_t child = sys_clone(
ThreadEntry, stack, CLONE_FILES | CLONE_FS | CLONE_UNTRACED, &thread_arg, NULL, NULL, NULL);
int r, status;
// Allow the child to ptrace us
sys_prctl(PR_SET_PTRACER, child, 0, 0, 0);
int *ptr = 0;
*ptr = 42; // <------- Crash here
SendContinueSignalToChild();
...
2. Send one of the handled signals to the parent (e.g. SIGSEGV) so that the GenerateDump(...) method above is invoked.
3. Observe that the parent exits but the child still exists, blocked in WaitForContinueSignal().
Output for the above steps:
dmytro@db:~$ ./test &
[1] 25050
dmytro@db:~$ Test: started
dmytro@db:~$ ps aflxw | grep test
0 1000 25050 18923 20 0 40712 2680 - R pts/37 0:13 | | \_ ./test
0 1000 25054 18923 20 0 6136 856 pipe_w S+ pts/37 0:00 | | \_ grep --color=auto test
dmytro@db:~$ kill -11 25050
[1]+ Segmentation fault (core dumped) ./test
dmytro@db:~$ ps aflxw | grep test
0 1000 25058 18923 20 0 6136 852 pipe_w S+ pts/37 0:00 | | \_ grep --color=auto test
1 1000 25055 1687 20 0 40732 356 pipe_w S pts/37 0:00 \_ ./test
Here 1687 is the PID of the init process that adopted the child.
In the real world, the crash happens in a thread running in parallel with the one that handles the signal.
NOTE: the issue can also be triggered by normal program termination (i.e. exit(0) is called in a parallel thread).
Tested on Linux 3.3.8-2.2, on mips and i686 platforms.
So, my 2 questions:
- Is it expected behavior for the breakpad library to keep the child alive? My expectation is that the child should exit immediately after the parent crashes or exits.
- If it is not expected behavior, what is the best way to terminate the child after the parent crashes or exits?
Thanks in advance!