
For example:

#include <stdio.h> 
int main (void)                         /* Why int and not short int? - Waste of Memory */ 
{
     printf("Hello World!");
     return 0; 
}

Why is main() conventionally defined with return type int, which occupies 4 bytes on a typical 32-bit platform, when it usually returns only 0 or 1? Other types such as short int (2 bytes on such a platform) or even char (1 byte) would save more memory.

It is wasting memory space.

NOTE: This question is not a duplicate of the linked thread; the answers there address only the returned value itself, not explicitly its data type.

The question is about both C and C++. If the answers differ between the two, please mention which language you are addressing.

  • 1
    @Ana, Considering your best shot with `void` is a non-portable program while `int` is guaranteed to work, I'd argue the accuracy of that statement. Refer to https://stackoverflow.com/questions/204476/what-should-main-return-in-c-and-c?rq=1 – chris Oct 11 '19 at 09:02
  • @Ana Thank you for your comments first. I don´t think it is a duplicate of the thread given, because the answers on that thread only respond to the value of return itself, not its datatype at explicit focus. – RobertS supports Monica Cellio Oct 11 '19 at 09:03
  • 1
    There is no requirement that `main()` return only 0 or 1. The values that can be returned are `0`, `EXIT_SUCCESS`, and `EXIT_FAILURE` where the value of `EXIT_SUCCESS` and `EXIT_FAILURE` are implementation-defined (and there is nothing preventing `EXIT_SUCCESS` being zero, either). Other values can be returned, but the behaviour is implementation defined. – Peter Oct 11 '19 at 09:10
  • 2
    Err, if it's really that much of a worry, you can get back those pesky "wasted" 2 bytes by shortening your string to `"Hi, World!"`. – paddy Oct 11 '19 at 09:33
  • Possible duplicate of [What "standard" application return/exit codes should an application support?](https://stackoverflow.com/questions/1538884/what-standard-application-return-exit-codes-should-an-application-support) – Kamiccolo Oct 11 '19 at 10:52
  • The return value is usually passed in a machine register, so you could consider using less of that register to be the wasteful option. – molbdnilo Oct 11 '19 at 10:59
  • 1
    See [Exitcodes bigger than 255 — possible?](https://stackoverflow.com/questions/179565/exitcodes-bigger-than-255-possible) for some relevant information (though there's more that could/should be said). – Jonathan Leffler Oct 11 '19 at 12:09
  • The memory wasted is essentially non-existent — the information is stored in a data structure in the kernel for the dead process so that it can be collected by _another process_ via `wait()` and its friends. When Unix was first written, `int` was a 16-bit type; there was no 'waste' because there was 16 bits of status information (exit code in 8 bits, signal number and core dump or not in another 8 bits). Worrying about 2 bytes 'wasted' in these days of gigabytes of main memory is pointless. In an embedded environment, it might be more significant, but you'd be hard-pressed to show it. – Jonathan Leffler Oct 11 '19 at 12:13
  • @Peter -- re: implementation-defined -- in the standard, "implementation defined" has a specific meaning: a conforming compiler is required to document its behavior. A better term for the values of `EXIT_SUCCESS`, `EXIT_FAILURE`, etc. would be "implementation specific". – Pete Becker Oct 11 '19 at 13:30
  • Scrap your 8051 and get a controller from this decade:) – Martin James Oct 11 '19 at 15:06
  • @PeteBecker - I suppose so. The wording from the 1989 C standard Section 7.20.4.3, para 5 for the `exit()` function is "If the value of `status` is zero or `EXIT_SUCCESS`, an implementation-defined form of the status *successful termination* is returned. If the value of `status` is `EXIT_FAILURE`, an implementation-defined form of the status *unsuccessful termination* is returned. Otherwise the status returned is implementation-defined." Which makes my teeth itch a little but leaves the actual values implementation-specific, as you say, but the resultant behaviour is implementation defined. – Peter Oct 11 '19 at 20:40
  • @Peter — whoops, I should have looked that up instead of assuming the values were unspecified. You’re right that they’re implementation defined. The whole point of this stuff is to be able to check the result in a shell script, and the actual value might matter. – Pete Becker Oct 11 '19 at 21:21
  • Who says that `main()` has to return only 0 or 1? This is exit value - it can be any `int` value. – i486 Mar 15 '20 at 11:40
  • @i486 Yeah, that was a confusion/mistake of mine, quite a while ago because I didn´t got the information about it which has changed through the helpful community - I apologize. But if I would change the title now, many answers and even a decent part of the question itself would get altered. – RobertS supports Monica Cellio Mar 15 '20 at 11:46
  • 1
    @RobertSsupportsMonicaCellio Note that return codes of `main()` are used by specific command shell / OS which can be very different. From DOS/Windows prompt, to Linux bash, sh, csh, etc. OS/2, etc. In all cases there is `main()` with exit code. – i486 Mar 15 '20 at 11:48

7 Answers

9

At the assembly level, a return value is usually passed in a register (for example, the AX register on Intel processors). The type int corresponds to the machine word. That is, no conversion to the machine word is required, as it would be for, say, a byte that corresponds to the type char.

And in fact, main can return any integer value.

machine_1
Vlad from Moscow
  • Note: returning other than 0, `EXIT_SUCCESS` or `EXIT_FAILURE` from `main` is not standard (it is implementation defined) – David Ranieri Oct 11 '19 at 09:33
  • 1
    @DavidRanieri The Standard says nothing about what return values are standard. – Vlad from Moscow Oct 11 '19 at 09:37
  • 3
    @DavidRanieri It is an incorrect interpretation. Simply the Standard introduces two macros to simplify writing the code. It does not mean that these values are standard and others not. In fact all values are implementation defined. – Vlad from Moscow Oct 11 '19 at 09:43
  • all values are implementation defined except 0 (meaning "success"), EXIT_SUCESS and EXIT_FAILURE, thats just what I say ;) – David Ranieri Oct 11 '19 at 09:53
  • 1
    @DavidRanieri Not true. **All** values are "implementation-defined". The [C standard says](https://port70.net/~nsz/c/c11/n1570.html#7.22.4.4p5): "If the value of status is zero or EXIT_SUCCESS, an implementation-defined form of the status successful termination is returned. If the value of status is EXIT_FAILURE, an implementation-defined form of the status unsuccessful termination is returned. Otherwise the status returned is implementation-defined." There's no way to get a non-implementation-defined value. – Andrew Henle Oct 11 '19 at 14:24
  • 1
    @AndrewHenle excuse me but I don't get you, if only 0, EXIT_SUCCESS and EXIT_FAILURE are defined by the standard, then the rest of values are "implementation defined", isn' it? and that's just what I'm saying, maybe my english is too bad :( – David Ranieri Oct 11 '19 at 15:31
6

It's because of a machine that's half a century old.

Back in the day when C was created, an int was a machine word on the PDP-11 - sixteen bits - and it was natural and efficient to have main return that.

The "machine word" was the only type in the B language, which Ritchie and Thompson had developed earlier, and which C grew out of.
When C added types, not specifying one gave you a machine word - an int.
(It was very important at the time to save space, so not requiring the most common type to be spelled out was a Very Good Thing.)

So, since a B program started with

main()

and programmers are generally language-conservative, C did the same and returned an int.

molbdnilo
5

There are two reasons I would not consider this a waste:

1 practical use of 4 byte exit code

If you want to return an exit code that describes an error precisely, you want more than 8 bits.

As an example you may want to group errors: the first byte could describe the vague type of error, the second byte could describe the function that caused the error, the third byte could give information about the cause of the error and the fourth byte describes additional debug information.

2 Padding

If you pass a single short or char, it will still be aligned to fit into a machine word, which is often 4 bytes (32 bits), depending on the architecture. This is called padding, and it means that you will most likely still need 32 bits of memory to return a single short or char.

Community
Lukas-T
  • 1. That´s a very fantastic information ! I might use this in the future. How i can use those "informative" bytes back if it is a main() program? Isn´t the OS get that error code values or should the error code then be sent to any program else? – RobertS supports Monica Cellio Oct 11 '19 at 09:22
  • Depends on how you call the program. For example in powershell there exists a variable $LASTEXITCODE. – Lukas-T Oct 11 '19 at 09:36
  • Regarding padding. Your wording is misunderstoodable and strictly spoken wrong: not the instance of the type is padded. The location of variables is aligned to to the next multiple it's size up to the word size. So if you have a single byte and then a int32_t. There will be 3 padding bytes between the two. If the second one is short them there will be one padding bytes – vlad_tepesch Oct 11 '19 at 11:37
  • @vlad_tepesch yes, I understand what you mean, thats why emphasised the _single_. I will edit it to make it clearer, thanks. – Lukas-T Oct 12 '19 at 10:54
3

The old-fashioned convention with most shells is to use the least significant 8 bits of int, not just 0 or 1. Using 16 bits is increasingly common, since that is the minimum size of an int allowed by the standard.

And what would the issue be with wasting space? Is the space really wasted? Is your computer so full of "stuff" that the remaining sizeof(int) * CHAR_BIT - 8 would make a difference? Could the architecture exploit that and use those remaining bits for something else? I very much doubt it.

So I wouldn't say the memory is wasted at all, since you get it back from the operating system when the program finishes. Perhaps extravagant? A bit like using a large wine glass for a small tipple, perhaps?

Bathsheba
  • 1. "The old-fashioned convention with most shells is to use the least significant 8 bits of `int`, not just 0 or 1." - Can you explain that a little more? What happens with a return value of 0 in that case exactly? 2. It is more like a theoretical thinking about saving memory, where memory could be saved, if possible. I totally know RAMs have enough capacity to do not need to take care of that few bits and it would be kind of ridiculous to in that context. But it is more a theoretical concern about why to implement `int`, when another datatype might "superficial" be more efficent. – RobertS supports Monica Cellio Oct 11 '19 at 09:35
  • @RobertS: It may well be more memory efficient but it certainly would not be more time-efficient – Bathsheba Oct 11 '19 at 09:37
2

1st: Your assumption that "it usually returns only 0 or 1" is already wrong.

Usually the return code is expected to be 0 if no error occurred, but otherwise it can be any number, to represent different errors. Most programs (at least command-line programs) do so. Many programs also return negative numbers.

However, there are a few commonly used codes: https://www.tldp.org/LDP/abs/html/exitcodes.html. Another SO member also points to a Unix header that contains some codes: https://stackoverflow.com/a/24121322/2331592

So after all, this is not just a C or C++ matter. There are also historical reasons in how most operating systems work and how they expect programs to behave. The languages have to support that, and C-like languages do so by using int main(...).

2nd: Your conclusion that "it is wasting memory space" is wrong.

  1. Using an int instead of a shorter type does not involve any waste. Memory is usually handled in word-sized units anyway (what that means depends on your architecture).
  2. Working with sub-word types involves computational overhead on some architectures (read: load the word, mask out the unrelated bits; store: load the word from memory, mask out the variable's bits, OR them with the new value, write the word back).
  3. The memory is not wasted unless you use it. If you write return 0; no memory is ever used at this point. If you write return myMemorySaving8bitVar; you only have 1 byte used (most probably on the stack, if it is not optimized out entirely).
vlad_tepesch
  • "otherwise it can return any number to represent different errors". Only a single byte, so not any number (nor any int). Only 0 to 255, or -128 to 127 if you treat it as signed. – nog642 Mar 31 '21 at 01:24
1

The answer is "because it usually doesn't return only 0 or 1." I found this thread from the Software Engineering community that at least partially answers your question. Here are the two highlights, first from the accepted answer:

An integer gives more room than a byte for reporting the error. It can be enumerated (return of 1 means XYZ, return of 2 means ABC, return of 3, means DEF, etc..) or used as flags (0x0001 means this failed, 0x0002 means that failed, 0x0003 means both this and that failed). Limiting this to just a byte could easily run out of flags (only 8), so the decision was probably to use an integer.

An interesting point is also raised by Keith Thompson:

For example, in the dialect of C used in the Plan 9 operating system main is normally declared as a void function, but the exit status is returned to the calling environment by passing a string pointer to the exits() function. The empty string denotes success, and any non-empty string denotes some kind of failure. This could have been implemented by having main return a char* result.

Here's another interesting bit from a unix.com forum:

(Some of the following may be x86 specific.)

Returning to the original question: Where is the exit status stored? Inside the kernel.

When you call exit(n), the least significant 8 bits of the integer n are written to a cpu register. The kernel system call implementation will then copy it to a process-related data structure.

What if your code doesn't call exit()? The c runtime library responsible for invoking main() will call exit() (or some variant thereof) on your behalf. The return value of main(), which is passed to the c runtime in a register, is used as the argument to the exit() call.

Related to the last quote, here's another from cppreference.com

5) Execution of the return (or the implicit return upon reaching the end of main) is equivalent to first leaving the function normally (which destroys the objects with automatic storage duration) and then calling std::exit with the same argument as the argument of the return. (std::exit then destroys static objects and terminates the program)

Lastly, I found this really cool example here (although the author of the post is wrong in saying that the result returned is the returned value modulo 512). After compiling and executing the following:

int main() {
    return 42001;
}

on my POSIX-compliant* system, echo $? prints 17. That is because 42001 % 256 == 17, which shows that 8 bits of data are actually used. With that in mind, choosing int ensures that enough storage is available for passing the program's exit status information, because, as per this answer, compliance with the C++ standard guarantees that the size of int (in bits)

can't be less than 8. That's because it must be large enough to hold "the eight-bit code units of the Unicode UTF-8 encoding form."

EDIT:

*As Andrew Henle pointed out in the comment:

A fully POSIX compliant system makes the entire int return value available, not just 8 bits. See pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html: "If si_code is equal to CLD_EXITED, then si_status holds the exit value of the process; otherwise, it is equal to the signal that caused the process to change state. The exit value in si_status shall be equal to the full exit value (that is, the value passed to _exit(), _Exit(), or exit(), or returned from main()); it shall not be limited to the least significant eight bits of the value."

I think this makes for an even stronger argument for the use of int over data types of smaller sizes.

gstukelj
  • 1
    A fully POSIX compliant system makes the entire `int` return value available, not just 8 bits. See https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html: "If `si_code` is equal to `CLD_EXITED`, then `si_status` holds the exit value of the process; otherwise, it is equal to the signal that caused the process to change state. The exit value in `si_status` shall be equal to the full exit value (that is, the value passed to `_exit()`, `_Exit()`, or `exit()`, or returned from `main()`); it shall not be limited to the least significant eight bits of the value." – Andrew Henle Oct 11 '19 at 14:37
  • That's a great resource, thanks @AndrewHenle! Will update my answer. – gstukelj Oct 11 '19 at 15:10
1

You're either working in or learning C, so I think it's a Real Good Idea that you are concerned with efficiency. However, a few things need clarifying here.

First, the int data type is not and never was intended to mean "32 bits". The idea was that int would be the most natural binary integer type on the target machine, usually the size of a register.

Second, the return value from main() is meant to accommodate a wide range of implementations on different operating systems. A POSIX system uses an unsigned 8-bit return code. Windows uses 32 bits that are interpreted by the CMD shell as 2's complement signed. Another OS might choose something else.

And finally, if you're worried about memory "waste", that's an implementation issue that isn't even an issue in this case. Return codes from main are typically returned in machine registers, not in memory, so there is no cost or savings involved. Even if there were, saving 2 bytes in the run of a nontrivial program is not worth any developer's time.

Mike Housky