
In the C programming language, all of the language revisions I have worked with required variables to be declared up front, before any non-declaration statements in the function. C++ seems to have waived this requirement in all versions. I also recognize that more modern versions of C have waived this requirement as well, but I have yet to use any of those standards.

The question I have is this: What historic reason was there for preventing the C language from declaring freely on-demand instead of up front?

Obviously there are a number of reasons that come to mind from an engineering standpoint, but none of them seem especially plausible to me.

  1. Preventing an obscure compiler behavioral error from occurring (such as infinite parsing loops, a massive memory bloat for evaluation, or some weird corner cases with Macros.)
  2. Preventing undesirable compiler output. This could be anything from symbol output muddling the debug process and the ease of development of debugging tools, to unexpected stack storage orders.
  3. Readability. I find this hard to swallow as well, seeing as C, while designed for readability compared to other languages of the era, did not enforce this type of structure nearly anywhere else. (Unless you see prototyping as being a similar enforcement, but if I recall prototypes were added in the '89 spec.)
  4. Implementation complexity and practical reasons. This is the one I'm most inclined to believe. As engineers we have to make certain considerations in order to ship a viable product in the time frame allotted. While I will grant that the professional landscapes of Computer Science and Software Engineering have both changed dramatically, business is still business. At the end of the day I'm sure Bell wanted a finished product that could be used in the Unix programming environment to showcase what they had achieved.

Does anyone have any good sources backing up any of the above? Did I miss something entirely? We can speculate from dawn till dusk, but I'm looking for good hard references.

Daniel Green
  • I believe this may be a dupe. But I'm too lazy to look for it. – Mysticial Jan 14 '13 at 18:46
  • I don't think any technical reason. It's a pretty trivial code transformation to add a block that opens before any declaration that occurs after an expression-statement, and that extends to the end of the block in which the declaration occurs. AFAIK, by the time of C89 it's a pretty much arbitrary restriction because there wasn't (yet) enough support for the alternative. – Steve Jessop Jan 14 '13 at 18:47
  • Tagged `language-design` since you're asking about rationale behind some design decisions. – Kos Jan 14 '13 at 18:48
  • Right, I gave that reason under #4. But that isn't a fact, it's an opinion. – Daniel Green Jan 14 '13 at 18:51
  • Btw, it's worth noting that standard C has *never* required variables to be declared up-front in functions. In C89 they can be declared at the start of any block within a function. But you might find compilers that responded to this by moving the stack pointer every time a block is entered that has an initial declarative region -- such compilers could conform to C89 whilst still making it "a bad idea" to actually use the facility. – Steve Jessop Jan 14 '13 at 18:53
  • Steve, I don't know if that is true or not; it would make sense, however. Can someone confirm that a standards-compliant compiler on the original '73 standard can do this? I don't even remember the '89 standard supporting this, but I could see it based on other '89 optimizations. – Daniel Green Jan 14 '13 at 18:55
  • @DanielGreen: Ah, for the purposes of my remark, the "original '73 standard" isn't a standard, it's just whatever Ritchie shipped. – Steve Jessop Jan 14 '13 at 18:56
  • @SteveJessop, I thought that ANSI ratified the 73 version of K&R C. Seems I am wrong. A quick trip down Wiki lane proved that. Learned something new. – Daniel Green Jan 14 '13 at 18:58
  • I doubt you'll find hard evidence for the reason. The place to look would be [Dennis Ritchie's home page](http://cm.bell-labs.com/who/dmr/). He has a section about C and its immediate ancestors that has a fair amount of early history of C. I've read through most of what's there, and don't recall seeing this mentioned. The early compiler code tends to agree with the simplicity idea though -- it used *seriously* tricky code to fit into available memory (e.g., the code generator reused the space occupied by the parser code to store data after parsing was finished). – Jerry Coffin Jan 14 '13 at 18:59
  • @DanielGreen this quote was already present in K&R 1st edition: *"Declarations of variables (including initializations) may follow the left brace that introduces any compound statement, not just the one that begins a function."* – ouah Jan 14 '13 at 19:02
  • "More modern" versions? C99 has been around long enough to need to shave! – ecatmur Jan 14 '13 at 19:08
  • @ecatmur: "modern" generally means "more recently than I started doing it" ;-) At Oxford University (where I studied mathematics), "modern history" starts in the 15th century (or probably more like the 4th century if you ask a Classicist). Has done ever since the first Regius Professor of Modern History was appointed in 1724, always will do! – Steve Jessop Jan 14 '13 at 19:55
  • I'm far from an expert on this, but my guess was always that folks used to assembly programming liked being able to see all the stack variables in one place and mentally add up the function stack size. – aschepler Jan 14 '13 at 20:03

3 Answers


According to the early (6th Edition Unix, 1975) C manual on Dennis Ritchie's home page, function-local variables in that version could only be declared at the beginning of a function:

"The function-statement is just a compound statement which may have declarations at the start."

function-statement: { declaration-list_opt statement-list }

declaration-list is not defined (an omission), but can be readily assumed to have grammar:

declaration-list: declaration declaration-list_opt

No other compound statement is allowed to contain variable (or indeed any) declarations.

This obviously simplifies the implementation: in the early compiler source file c02.c, the function-header routine blkhed() only needs to sum the stack space used by auto variable declarations, recording each variable's stack offset as it goes, and emit code to bump the stack pointer by the appropriate amount. On function exit (by return or by falling off the end), the implementation just needs to restore the saved stack pointer.
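
As a rough illustration of why this is cheap, the single pass described above might look something like the following. This is only a sketch in modern C, not the actual code from c02.c; `struct auto_var` and `layout_frame` are names invented here, and alignment is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one `auto` variable; not taken from c02.c. */
struct auto_var {
    const char *name;   /* identifier, for illustration only */
    size_t      size;   /* bytes the variable occupies */
    size_t      offset; /* filled in: distance from the start of the frame */
};

/*
 * Assign each variable an offset in declaration order and return the
 * total frame size. Because every declaration is known before the first
 * statement, one pass is enough, and the caller could emit a single
 * stack-pointer adjustment for the returned total.
 */
size_t layout_frame(struct auto_var *vars, size_t count)
{
    size_t total = 0;
    size_t i;

    assert(vars != NULL || count == 0);
    for (i = 0; i < count; i++) {
        vars[i].offset = total;   /* variable starts where the last one ended */
        total += vars[i].size;
    }
    return total;
}
```

With three hypothetical variables of sizes 2, 2 and 4, this assigns offsets 0, 2 and 4 and reports a frame of 8 bytes. Nothing has to be revisited once the first statement is parsed, which fits a compiler that was short on memory.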

The fact that K&R feels it necessary to state that "declarations of variables (including initializations) may follow the left brace that introduces any compound statement, not just the one that begins a function" is a hint that at that point it was a relatively recent feature. It also indicates that the combined declaration-initialization syntax was a recent feature, and indeed in the 1975 manual declarators cannot have initializers.

The 1975 manual in section 11.1 specifically states that:

C is not a block-structured language; this may fairly be considered a defect.

Block-statement and initialized declarations (K&R) address that defect, and mixed declarations and code (C99) are the logical continuation.
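
To make the three stages concrete, here is a small sketch. The function names are made up for illustration, and the first variant only mimics the 1975 placement rule in modern syntax, since `const` and prototypes did not exist then.

```c
#include <assert.h>

/* Stage 1: every declaration at the top of the function body. */
int sum_function_top(const int *values, int count)
{
    int i;
    int total;

    total = 0;
    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Stage 2 (K&R, C89): declarations may also open any nested block. */
int sum_block_start(const int *values, int count)
{
    int total = 0;
    int i;

    for (i = 0; i < count; i++) {
        int value = values[i];        /* legal: first thing in the block */
        total += value;
    }
    return total;
}

/* Stage 3 (C99): declarations and statements may be mixed freely. */
int sum_mixed(const int *values, int count)
{
    assert(count >= 0);               /* a statement comes first ... */
    int total = 0;                    /* ... then a declaration */
    for (int i = 0; i < count; i++)   /* loop-scoped declaration, also C99 */
        total += values[i];
    return total;
}
```

All three compute the same sum; only the places where a declaration may appear differ. A compiler in strict C89 mode should reject `sum_mixed`, while the first two are accepted under either standard.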

Pablo
ecatmur

In C89, variable definitions are required to be at the start of a block. (See the C standard for the definition of a block.) As far as I know, this was done to simplify the way variables are handled in assembler. For example, let us have a look at a simple function:

#include <stdio.h>

void foo(void)
{
    int i = 5;              /* defined at the start of the block, as C89 requires */
    printf("%i\n", i);
}

When gcc translates this function into assembly code, the call to foo() boils down to a bunch of instructions, including setting up a stack frame for the function scope. This stack frame includes space for the variables defined in the scope of the function, and to match that scope in the higher-level language C, they were required to be defined at the beginning of the block.

In the end, it was about ease of implementation and also efficiency: declaring a bunch of variables at once, that is, at the beginning of a block, enables the compiler to bulk-push them onto the stack, and around '89 that was also a performance consideration.

Of course, this answer is horribly simplified and aims only to give a brief idea of why this was done the way it was. For more information, you should probably read some drafts of the early C89 standard.

Andreas Grapentin

A short answer that doesn't really answer much: the C language initially inherited this declaration-order restriction from its predecessor, the B language. Why it was done that way in B, I unfortunately don't know.

Note also that in the nascent C (described in "C Reference Manual") it was illegal to initialize variables (even local ones) with non-constant expressions.

int a 5;
int b a; /* ERROR in nascent versions of C */

(A side note: in the CRM, initialization syntax did not include the = character.) In the general case, this effectively negated the main benefit of in-code variable declarations: the ability to specify a meaningful run-time value as an initializer. Even in the much more modern C89/90, this restriction still formally applied to aggregate initializers (although most compilers ignored it).

int a = 5, b = a;
struct { int x, y; } s = { a, b }; /* ERROR even in C89/90 */

Only in C99 did it become possible to use run-time values for all kinds of local initialization. This finally unlocked the full power of in-code variable declarations, so it is perfectly logical that C99 was the one to introduce them.
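
As a sketch of what C99 permits, assuming a hypothetical `struct point` and helper names that are not part of the original discussion:

```c
struct point { int x, y; };

/* Build a point from values that are only known at run time. */
struct point make_point(int a, int b)
{
    /* Non-constant expressions in a brace-enclosed initializer for an
       automatic aggregate: formally invalid in C89/90, valid in C99. */
    struct point p = { a, b };
    return p;
}

/* Declare the aggregate only once the values it needs exist. */
int scaled_sum(int a, int b, int scale)
{
    int sum = a + b;
    sum *= scale;                        /* a statement ... */
    struct point p = { sum, sum - a };   /* ... followed by a declaration */
    return p.x + p.y;
}
```

Here `scaled_sum` relies on both C99 changes at once: the declaration of `p` follows a statement, and its initializer list uses values computed just above it.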

AnT stands with Russia