Your tutorial is not exactly wrong, but it describes the behavior of a very strict implementation, stricter than the C standard requires implementations to be.
What the C standard says about the function that is called on program start is just that certain possibilities are required to work. It doesn't say that other things are required not to work. Also, the C standard says very little about compiler errors. It is common for modern C implementations to go well beyond the standard's requirements for diagnostics; it is also common for them to support many extensions to the set of programs they will accept.
In a "hosted" environment, one that provides all of the facilities of the standard C library, programs that give the function called on program start the name and signature int main(void) or int main(int argc, char **argv) are conforming. They are required to work. But the standard permits this function to be declared "in some other implementation-defined manner" as well, and many alternative names and signatures exist: I'm just going to list a few of the most common ones.
int main(int argc, char **argv, char **envp)
void main(void)
int WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, LPSTR lpCmdLine, int nCmdShow)
If your implementation documents that it supports one of these alternative entry point names or signatures, it's perfectly OK to use it. (Your program will not be "strictly conforming", but almost no real programs are "strictly conforming", so don't worry about it.)
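As a rough illustration of what the two-argument form hands you, here is a small sketch of a helper that a conforming int main(int argc, char **argv) might call first. The name program_name and its fallback parameter are made up for this example. It relies only on the standard's guarantees that argc is nonnegative and that argv[0], when argc is greater than zero, represents the program name (possibly as an empty string).

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the program name a conforming main would
 * see in argv[0], or `fallback` when the host did not supply one.
 */
const char *program_name(int argc, char **argv, const char *fallback)
{
    if (argc > 0 && argv != NULL && argv[0] != NULL && argv[0][0] != '\0')
        return argv[0];
    return fallback;
}
```

A main using it would start with something like const char *me = program_name(argc, argv, "a.out"); and then use me in its error messages.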
In a "freestanding" environment, which doesn't provide all of the standard C library, the name and signature of the function called on program start are left up to the implementation; you might have to use something wacky like EFI_STATUS efi_main (EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable). However, it is common for int main(void) to work, and somewhat less common for int main(int argc, char **argv) to work (the arguments received at runtime may be garbage).
Now, what happens if you have a hosted environment and you use an entry-point function name and/or signature that isn't documented to work? The C standard says that your program has undefined behavior in this case—anything at all is allowed to happen. Some common things that will actually happen are:
The compiler does issue an error, or at least a warning. It's not getting the correct prototype from any header file when it does this; rather, the correct prototype is said to be built in to the compiler, defined in the compiler's own source code. Nowadays this is the case for many C library functions as well as main. A demonstration:
$ cat > test.c <<\!
extern int exit(int); // wrong, `exit` should return `void`
void main(void) {} // wrong, `main` should return `int`
!
$ gcc -fsyntax-only -std=gnu11 -Wall test.c
test.c:1:12: warning: conflicting types for built-in function ‘exit’
test.c:2:6: warning: return type of ‘main’ is not ‘int’
(For historical reasons, GCC is not nearly as picky about people's code as it could be; many of the things that, from a modern perspective, should be errors are merely warnings, and not even warnings that are on by default. If you're writing new code from scratch and you use GCC, I recommend using -std=gnu11 -Wall -Wextra -Wpedantic -Werror options basically always. Not -std=c11, though, because that turns off extensions that you may need, and can also expose bugs in the system headers.)
The program fails to link. This is what happens, for instance, if you try to make up your own name, instead of calling it main:
$ cat > test.c <<\!
extern int puts(const char *);
void my_program_starts_here(void) { puts("hello world"); }
!
$ gcc -std=gnu11 -Wall test.c
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o:
In function `_start':
(.text+0x20): undefined reference to `main'
This is one of the more cryptic errors you can get out of the linker, so I'll unpack it a little. Have you ever wondered how main gets called? It's surprisingly simple: there's a function provided by the C library, conventionally called _start, whose last line is something like
exit(main(argc, argv, environ));
For historical reasons, this function isn't bundled with the bulk of the C library in libc.so. It's in a separate object file, crt1.o, that the compiler automatically pulls in when asked to link a program (just like it automatically tacks on an -lc). Thus, when you don't define main, the reference to main from _start is unsatisfied and the link fails.
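To make the relationship between the startup code and main a bit more concrete, here is a toy model of that last step. It is only a sketch: entry_fn, model_start, and sample_entry are names invented for this example, and the real _start gets argc, argv, and the environment from the operating system rather than from a caller. The model also returns the value instead of passing it to exit, so that it can be inspected.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type for the entry point the startup code expects. */
typedef int (*entry_fn)(int argc, char **argv, char **envp);

/*
 * Toy model of the last step of _start: call the entry point and hand
 * back the value that the real startup code would pass to exit().
 */
int model_start(entry_fn entry, int argc, char **argv, char **envp)
{
    assert(entry != NULL);  /* the real linker enforces this for main */
    return entry(argc, argv, envp);
}

/* Stand-in for main: succeeds only if at least one argument was given. */
int sample_entry(int argc, char **argv, char **envp)
{
    (void)argv;
    (void)envp;
    return argc > 1 ? 0 : 2;
}
```

The assertion plays the role of the linker error above: in the real thing, a missing main is caught at link time, not at run time.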
(OK, how does _start get called? That's where you get into deeper magic. Ask another question.)
Finally, the program may compile and link fine and even appear to work correctly, but look harder and you discover that it is misbehaving. This is what happens if you use void main(void) on a Unix system. (To first order, all hosted environments other than Windows are Unix systems nowadays.)
$ cat > test.c <<\!
extern int puts(const char *);
void main(void) { puts("hello world"); }
!
$ gcc -std=gnu11 test.c
$ ./a.out
hello world
Without -Wall, not a peep from the compiler, and the program ran fine...or did it?
$ ./a.out ; echo $?
hello world
12
The value that's supposed to be returned from main becomes the program's exit status, which appears in the shell variable $?. If main had been properly declared to return int, and there had been a return 0; at the end, the echo $? would have printed 0. Where did 12 come from? Probably it was the return value of puts (12 happens to be the length of "hello world" plus the newline), which the compiler did not bother clearing out of the return-value register before returning from main.
It's easy to not notice this bug, but it is a bug and the first person who tries to write a shell script that involves your program will be annoyed with you.
Some footnotes about the exit status, mainly for pedants:
In C++, and in C starting with the 1999 standard, you are technically allowed to omit any explicit return 0; at the end of main as long as you declare it correctly, but I think relying on this is poor style.
On many but not all implementations of Unix, the value that shows up in $? will be only the low seven or eight bits of the value returned from main. This is a limitation in the system call used to retrieve the exit status of a child process, waitpid.
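The truncation is easy to observe directly. The following is a minimal sketch, assuming a POSIX system with fork and waitpid; observed_exit_status is a name made up for this example. It forks a child that exits with a given value and reports what the parent sees. The child calls _exit instead of returning from main, but the parent sees the status through the same mechanism.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits with `code`, then report the status the parent
 * sees through waitpid(). Returns -1 if fork or waitpid fails, or if the
 * child did not exit normally.
 */
int observed_exit_status(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);            /* child: terminate immediately */

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

On a system that keeps the low eight bits, observed_exit_status(300) comes back as 44 (300 minus 256), and observed_exit_status(256) is indistinguishable from success.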
A strictly conforming ISO C program can only return three values from main: 0, EXIT_SUCCESS, and EXIT_FAILURE; the latter two are macros defined in stdlib.h. The effect of returning zero from main is guaranteed to be the same as the effect of returning EXIT_SUCCESS, but the values are not guaranteed to be equal.
In practice, it is safe to return at least 0, 1, and 2, and the implementations where EXIT_SUCCESS != 0 and/or EXIT_FAILURE != 1 have long since gone to the great bit bucket in the sky, so don't worry about it.
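If you do want to stay within the strictly portable set anyway, one way is to funnel the decision through a tiny helper. This is just a sketch, and exit_status_for is a name invented for the example:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Map a success flag to a portable value to return from main. */
int exit_status_for(bool ok)
{
    return ok ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A correctly declared int main(void) would then end with something like return exit_status_for(ok); where ok records whether the program did what it was asked to do.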