
I'm testing code that is designed to detect when a child process has segfaulted. Imagine my surprise when this code does not always segfault:

#include <stdio.h>

int main() {
  char *p = (char *)(unsigned long)0;
  putchar(*p);
  return 0;
}

I'm running under a Debian Linux 2.6.26 kernel; my shell is the AT&T ksh93 from the Debian ksh package, Version M 93s+ 2008-01-31. Sometimes this program segfaults, but otherwise it simply terminates silently with a nonzero exit status and no message. My signal-detecting program reports the following:

segfault terminated by signal 11: Segmentation fault
segfault terminated by signal 53: Real-time signal 19
segfault terminated by signal 11: Segmentation fault
segfault terminated by signal 53: Real-time signal 19
segfault terminated by signal 53: Real-time signal 19
segfault terminated by signal 53: Real-time signal 19
segfault terminated by signal 53: Real-time signal 19

Running under pure ksh shows that the segfault is also rare:

Running... 
Running... 
Running... 
Running... 
Running... 
Running... Memory fault
Running... 

Interestingly, bash correctly detects the segfault every time.

I have two questions:

  1. Can anyone explain this behavior?

  2. Can anyone suggest a simple C program that will segfault reliably on every execution? I have also tried kill(getpid(), SIGSEGV), but I get similar results.


EDIT: jbcreix has the answer: my segfault detector was broken. I was fooled because ksh has the same problem. I tried with bash and bash gets it right every time.

My error was that I was passing WNOHANG to waitpid(), where I should have been passing zero. I don't know what I could have been thinking! One wonders what is the matter with ksh, but that's a separate question.
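The WNOHANG mistake can be illustrated with a minimal sketch. This is not the original detector: the helper name `signal_that_killed_child` is hypothetical, and the child simply raises SIGSEGV on itself. With WNOHANG, waitpid() returns 0 immediately if the child has not yet changed state and leaves the status untouched, so decoding the status at that point can produce garbage, which is presumably where the bogus "signal 53" reports came from. Passing 0 makes the call block until the child has actually terminated.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): fork a child that
 * kills itself with SIGSEGV, wait for it, and return the signal that
 * terminated it. Returns 0 if the child exited normally, -1 on error. */
int signal_that_killed_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL); /* make sure the default action applies */
        raise(SIGSEGV);           /* default action terminates the child */
        _exit(0);                 /* only reached if SIGSEGV was blocked */
    }

    int status = 0;
    /* options == 0: block until the child terminates. With WNOHANG the
     * call could return 0 right away and leave status unset. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    /* Without WUNTRACED/WCONTINUED, a reaped child has either exited
     * or been killed by a signal. */
    assert(WIFEXITED(status) || WIFSIGNALED(status));
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Called from a test driver, the function should report SIGSEGV (signal 11 on Linux) on every run. A detector that does need WNOHANG has to check for a return value of 0 and retry before it looks at the status.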

Bill the Lizard
Norman Ramsey
  • What's wrong with `exit(0)`? If you want the child to exit ... – pmg Nov 14 '09 at 19:42
  • I don't want the child to exit---I want it to segfault. Why? I'm building a segfault detector and I need a way to test it! – Norman Ramsey Nov 15 '09 at 05:12
  • Norman, I tested it at your request. My code works exactly as I said it would, thanks. Not sure what is wrong on your system; maybe it's the parent process you are using to catch signals, and ksh as well? I used kernel 2.6.28.7 to test the mmap answer. It also works with raise(SIGSEGV). – Heath Hunnicutt Nov 15 '09 at 05:40
  • It is segfaulting every time, and you are just failing to detect it. Even vulnerable Linux systems don't start programs with address 0 mapped by default, and the fact that kill produced the same results makes this quite clear. Your segfault detector needs more work. If you don't believe it, try munmap((void*)((quad_t)main/4096*4096),4096); programs usually like having themselves mapped, and the results are going to be the same. –  Nov 15 '09 at 09:43

3 Answers


Writing to NULL will reliably segfault or bus error.

Sometimes an OS will map a read-only page to the zero address. Thus, you can sometimes read from NULL.

Although C defines the NULL address as special, the 'implementation' of that special status is actually handled by the Operating System's Virtual Memory (VM) subsystem.

WINE and dosemu need to map a page at NULL for Windows compatibility. See mmap_min_addr in the Linux kernel to rebuild a kernel which cannot do this.

mmap_min_addr is currently a hot topic due to a related exploit and a public flame toward Linus (of Linux fame, obviously) from Theo de Raadt, of the OpenBSD effort.

If you are willing to code the child this way, you could always call: raise(SIGSEGV);

Also, you can obtain a guaranteed-to-segfault pointer from: int *ptr_segv = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_PRIVATE | MAP_NORESERVE | MAP_ANONYMOUS, -1, 0);

Here PROT_NONE is the key: it reserves memory that cannot be accessed. On 32-bit Intel Linux, PAGE_SIZE is 4096.
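As a rough sketch of the PROT_NONE approach, the program fragment below maps one inaccessible page and has a forked child read from it, so that the parent can observe the resulting signal. The helper name `signal_from_touching_prot_none` is hypothetical. The sketch also assumes a few departures from the one-liner above: it asks `sysconf(_SC_PAGESIZE)` for the page size instead of hard-coding it, drops MAP_NORESERVE for portability, and checks for MAP_FAILED, which is how mmap reports failure.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std= modes on glibc */
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that reads from a PROT_NONE page.
 * Returns the signal that killed the child, 0 if it exited normally,
 * or -1 if setup failed. */
int signal_from_touching_prot_none(void)
{
    long page_size = sysconf(_SC_PAGESIZE); /* avoids hard-coding 4096 */
    if (page_size <= 0)
        return -1;

    /* Reserve one page that allows no reads, writes, or execution. */
    void *page = mmap(NULL, (size_t)page_size, PROT_NONE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED) /* mmap signals failure with MAP_FAILED, not NULL */
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(page, (size_t)page_size);
        return -1;
    }

    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);
        signal(SIGBUS, SIG_DFL);
        volatile char *p = page; /* volatile keeps the read from being optimized away */
        char c = *p;             /* this access faults */
        (void)c;
        _exit(0);                /* not reached if the fault is delivered */
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0); /* 0, not WNOHANG: block until done */
    munmap(page, (size_t)page_size);
    if (waited != pid)
        return -1;

    assert(WIFEXITED(status) || WIFSIGNALED(status));
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the child should die with SIGSEGV; some other systems may deliver SIGBUS for a protection fault, so a portable test would accept either.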

mesmerizingr
Heath Hunnicutt
  • Heh, the two biggest flamers in OSS meet, it's ... like ... the irresistible force meets the immovable object ... like ... two locomotives crashing head-on ... I wish I could sell tickets – DigitalRoss Nov 14 '09 at 20:00
  • Heath, I'd love to know if you tested your answer on a Linux system. I can't make it work... – Norman Ramsey Nov 15 '09 at 05:21
  • Heath, I tried these and tried `raise(SIGSEGV)` too. Same problem. Unsurprising as raise(SIGSEGV) is supposed to be equivalent to kill(getpid(), SIGSEGV) on a single-threaded program. Have you tested any of your suggestions? I would love to know if my (awful) results are reproducible. – Norman Ramsey Nov 15 '09 at 05:24
  • PROT_NONE should work, it's the way Java and other systems know when to run the garbage collector. (Allocate by adding to the frontier pointer, if you get a SIGSEGV, you've gone past your space, so run a GC or get more memory.) – Adam Goode Nov 15 '09 at 05:40
  • PROT_NONE does work. Surprised by the claims above, I went to the trouble of verifying it. – Heath Hunnicutt Nov 15 '09 at 05:50
  • Heath: thanks for doing the testing. Your testing made me dig deeper, and the outcome was that I now have a bug report to file against `ksh`. I wish I had thought of trying `bash` before posting, but I never dreamed David Korn could get this wrong :-) – Norman Ramsey Nov 15 '09 at 21:56
  • @Roman - I don't really approve of that edit. There's no reason to raise a particular signal, rather than have the reader decide which type of signal to raise. Blech. – Heath Hunnicutt Aug 12 '13 at 17:58

I'm not sure why it doesn't have consistent behavior. I'd think that it's not as nit-picky with reading. Or something like that, though I'd probably be totally wrong.

Try writing at NULL. This seems to be consistent for me. I have no idea why you'd want to use this though. :)

int main()
{
    *(int *)0 = 0xFFFFFFFF;
    return -1;
}
pbos

The answer to question number two, from Wikipedia:

 int main(void)
 {
     char *s = "hello world";
     *s = 'H';
 }
Amir Afghani
  • I disagree with what Wikipedia says. The program *may* create a segmentation fault, but we can't be sure. The behavior is undefined IIRC. – Bastien Léonard Nov 14 '09 at 22:18
  • Yes, some compilers are allowed to place literals in the (read/write) data segment. I remember the shock when gcc made the change. I guess I'm dating myself... – Norman Ramsey Nov 15 '09 at 05:11