
For a moment I was very proud of myself for having written what was probably my first bug-free C program. Here is the entire source code:

int main;

It compiles perfectly even without the int, but then the compiler issues a warning (even without -Wall) and, as a programmer aiming for a bug-free program, I treat warnings as errors.

Having happily compiled this application, I immediately rushed to launch it. To my surprise, a segmentation fault error appeared...


Now seriously. What exactly is happening?

My guess is as follows: it's the lack of main's definition. This is so obvious and yet the compiler permits it. OK, main may be defined in a different unit. But even the linker doesn't do anything about it. Any particular reason why?

emesx
  • I think the problem is that you just define the prototype but not the function itself, but wait: You just define a variable and no functions. So far the entry point is not defined. – rekire Mar 11 '13 at 07:24
  • This [explanation on Reddit](http://www.reddit.com/r/programming/comments/19xyw1/some_dark_corners_of_c_presentation/c8swcpi?context=2) might help. – DCoder Mar 11 '13 at 07:27
  • It depends on how and where you compile your program. Hosted environment programs need a `main` but Freestanding environment programs don't. – Alok Save Mar 11 '13 at 07:29
  • Your program is not valid *C* (for a hosted environment). The standard requires `main` to be a function. You are getting undefined behavior. The linker sees a `main` symbol defined. Then the startup code `crt0.o` calls it, hence the crash. – Basile Starynkevitch Mar 11 '13 at 07:30
  • If it's a segmentation fault, I think (not sure) that it's because of the missing return code. You should use GCC to also generate the assembly and see exactly what's going on. – MasterMastic Mar 11 '13 at 07:30
  • @BasileStarynkevitch: Nopes. Only hosted environment programs need a main. – Alok Save Mar 11 '13 at 07:30
  • @DCoder the reddit thread is actually interesting! Didn't see it before. Thanks a lot! – emesx Mar 11 '13 at 07:40
  • Regarding warnings: `gcc -Wall -g -o main main.c` gives me `main.c:1: warning: ‘main’ is usually a function` (using `gcc (Debian 4.4.5-8) 4.4.5`). – alk Mar 11 '13 at 07:49
  • @alk +1, didn't use Wall – emesx Mar 11 '13 at 08:08
  • @DCoder you (or the original poster) should probably cite the linked explanation as an answer to this question. – moooeeeep Mar 11 '13 at 08:12

1 Answer


The word main is a legal name for any variable. The typical use case is to give the compiler a function named main, which it compiles into an object file; that file is then linked with crt0.o, which provides run-time initialization (stack allocation etc.) and jumps to the label main.

In C object files, symbols are not associated with prototypes, so the linker succeeds in resolving the global variable int main; as the main program to jump to. This "program", however, is garbage. Its bytes are most likely initialized to zeros, but sooner or later the processor either encounters an instruction that accesses memory outside the program's allocated data space (stack + heap), or the instruction flow runs past the limits of the reserved code space.

Either will cause a segmentation fault. In fact, if the system runs on an architecture with execute-permission flags on memory pages, the program segfaults at the very first attempt to jump to a data segment or page without execute permission.
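The "most likely initialized to zeros" part can be checked without running the broken program. The following is only a minimal sketch, assuming a hosted environment on a mainstream platform; `not_main` and `object_is_all_zero` are hypothetical names introduced here, with `not_main` standing in for the object that `int main;` defines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for `int main;`: a file-scope object with no
 * initializer, which C initializes to zero before startup. */
int not_main;

/* Returns 1 if every byte of the object is zero, 0 otherwise. */
int object_is_all_zero(const void *obj, size_t size)
{
    assert(obj != NULL);
    const unsigned char *bytes = obj;
    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0) {
            return 0;
        }
    }
    return 1;
}
```

Called from a real `main` function as `object_is_all_zero(&not_main, sizeof not_main)`, this should return 1, so a jump to such an object would begin by decoding zero bytes as instructions, provided the page were executable at all.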

Further reading to support the discussion in the comments: Data Execution Prevention, NX bit

Aki Suihkonen
  • In MS-DOS era I suppose, it was possible to write a working `char main=0xc3;` – Aki Suihkonen Mar 11 '13 at 07:48
  • try `const char main=0xc3;` – moooeeeep Mar 11 '13 at 08:00
  • Right, or even `const main=195;` – Aki Suihkonen Mar 11 '13 at 08:14
  • On any decent modern OS though, `const char main=0xc3` will probably segfault anyway because the data section of the executable isn't supposed to be executable. – tangrs Mar 11 '13 at 10:14
  • Works on x64 Ubuntu. The point is that `const` arrays/strings are compiled into the .text segment so that they can be reached with the `mov label[%rip], %rax` paradigm, which is the shortest machine code sequence in x64 to load a 64-bit constant. – Aki Suihkonen Mar 11 '13 at 10:16
  • Fine answer. I only wish it were completed with some references (e.g. about data execution prevention). Just for the sake of completeness. The reddit thread could be useful. – emesx Mar 11 '13 at 16:05
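The `0xc3` variants in the comments rely on that byte being the one-byte near-return (`ret`) opcode on x86, so a `main` symbol that points at it returns to the startup code at once. The sketch below executes no data; it only compares the stored values. The names `main_byte_hex`, `main_word_decimal` and `first_byte_is_ret` are hypothetical and introduced purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the objects the comments would name `main`. */
const unsigned char main_byte_hex = 0xc3; /* as in `const char main=0xc3;` */
const int main_word_decimal = 195;        /* as in `const main=195;` (implicit int) */

/* Returns 1 if the first byte of the object in memory is 0xc3,
 * the x86 near-return opcode, and 0 otherwise. */
int first_byte_is_ret(const void *obj)
{
    assert(obj != NULL);
    return *(const unsigned char *)obj == 0xc3;
}
```

Both spellings store the same value, since 0xc3 is 195 in decimal. For the `int` version the first byte in memory is 0xc3 only on a little-endian machine such as x86, which is the assumption behind the `const main=195;` variant.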