
I have a few questions about C++ compilers:

  • Are C++ compilers required to be one-pass compilers? Does the Standard talk about it anywhere?

  • In particular, is GCC a one-pass compiler? If it is, then why does it generate the following error twice in this example (though the template argument is different in each error message)?

    error: declaration of ‘adder<T> item’ shadows a parameter
    error: declaration of ‘adder<char [21]> item’ shadows a parameter

A more general question


Useful links:

Nawaz
  • This might interest you: http://en.wikipedia.org/wiki/Comeau_C/C%2B%2B. I think it uses a multipass approach to be able to support the export keyword for templates. – Gabriel Schreiber Mar 16 '11 at 09:21

5 Answers


The standard sets no requirements whatsoever with regard to how a compiler is implemented. But what do you mean by "one-pass"? Most compilers today read the input file only once. They create an in-memory representation (often in the form of some sort of parse tree), and may make multiple passes over that; they almost certainly make multiple passes over parts of it. The compiler must make a "pass" over the internal representation of a template each time it is instantiated, for example; there's no way of avoiding that. G++ also makes a "pass" over the template when it is defined, before any instantiation, and reports some errors then. (The standard committee expressly designed templates to allow a maximum of error detection at the point of definition. This is the motivation behind the requirement for typename in certain places, for example.) Even without templates, a compiler will generally have to make two passes over a class definition if there are functions defined in it.
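To make the typename remark concrete, here is a minimal sketch; the names `first_or` and `demo_first_or` are invented for illustration and are not from the question. It shows a dependent name that the compiler could not classify as a type or a value at the point of definition without `typename`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: returns the first element of a container, or a
// fallback value if the container is empty.
template <typename Container>
typename Container::value_type
first_or(const Container& c, typename Container::value_type fallback) {
    // Container::const_iterator depends on the template parameter, so at
    // definition time the compiler cannot tell whether it names a type or
    // a value. `typename` settles that, so this statement can be parsed
    // as a declaration before any instantiation exists.
    typename Container::const_iterator it = c.begin();
    return it != c.end() ? *it : fallback;
}

// Small usage check for the sketch above.
void demo_first_or() {
    std::vector<int> values{7, 8, 9};
    std::vector<int> empty;
    assert(first_or(values, -1) == 7);
    assert(first_or(empty, -1) == -1);
}
```

Calling `demo_first_or()` from `main` exercises both branches. Since C++20 the keyword may be omitted in a few positions such as the return type, but writing it is always permitted.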

With regard to the more general question, again, I think you'd have to define exactly what you mean by "one-pass". I don't know of any compiler today which reads the source file several times, but almost all will visit some or all of the nodes in the parse tree more than once. Is this one-pass or multi-pass? The distinction was more significant in the past, when memory wasn't sufficient to maintain much of the source code in an internal representation. Languages like Pascal and, to a lesser degree, C were sometimes designed to be easy to implement with a single-pass compiler, since a single-pass compiler would be significantly faster. Today, this issue is largely irrelevant, and modern languages, including C++, tend to ignore it; where C++ seems to conform to the needs of a one-pass compiler, it's largely for reasons of C compatibility, and where C compatibility is not an issue (e.g. in a class definition), it often makes order of declaration irrelevant.
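A short sketch of that contrast, with invented names (`plus_one`, `twice_plus_one`, `Counter`): at namespace scope a name has to be declared before it is used, while inside a class the member function bodies may refer to members that appear further down.

```cpp
#include <cassert>

// At namespace scope a name must be declared before use, the rule
// inherited from C. Without this forward declaration, the call inside
// twice_plus_one() would not compile.
int plus_one(int x);

int twice_plus_one(int x) { return plus_one(2 * x); }

int plus_one(int x) { return x + 1; }

// Inside a class, member function bodies are compiled as if they appeared
// after the closing brace, so they may use members declared further down.
class Counter {
public:
    int next() { return step(++count_); }  // step() and count_ come later
private:
    int step(int n) { return n * stride_; }
    int count_ = 0;
    int stride_ = 10;
};

// Small usage check for the sketch above.
void demo_declaration_order() {
    assert(twice_plus_one(3) == 7);
    Counter c;
    assert(c.next() == 10);
    assert(c.next() == 20);
}
```

To compile `next()`, the compiler has to have seen the whole class first, which is the second "pass" over a class definition mentioned above.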

James Kanze
  • You keep asking what "one pass" or "multi pass" is, yet these are well defined terms. Links are provided in the original question and no additional information is needed. Obviously, multi-pass does not require you to parse the source code several times, but it requires to hold whole program in some kind of form and then restart the transformation from the beginning. One-pass forbids that. One pass does not allow, for example, using variables or functions which are declared only later in the code. But C++ permits that in class scope. – CygnusX1 Mar 16 '11 at 16:29
  • Are they? The Wikipedia contradicts itself in the first paragraph. Historically, one-pass meant one pass over the source, but in a context where the compiler couldn't save much more than the symbol table in memory. So a program compiled with a one-pass compiler couldn't refer to variables declared later in the code. C++ allows using variables which are first declared later in certain cases. But all of the compilers I know will only read the source code once. – James Kanze Mar 16 '11 at 19:25
  • As I said: "multi-pass does not require you to parse the source code several times, but it requires to hold whole program in some kind of form and then restart the transformation from the beginning." By contrast, single-pass does not require you to hold the whole program in memory in any form. – CygnusX1 Mar 16 '11 at 19:58
  • Which doesn't get us very far. What do you mean by "whole program"? In particular, C++ requires the compiler to keep templates and class definitions available, but once a function has been parsed, the memory image for it can be thrown out. And it's possible to parse an ordinary function by just visiting the tree once (although typically, small parts of the tree may be visited more than once to resolve ambiguities). – James Kanze Mar 17 '11 at 16:20
  • Agreed... the terms "one-pass" or "multi-pass" are artifacts from an older era of computing, but old terms like these stick around on Wikipedia long after they lose their importance in textbooks. – Dietrich Epp Mar 24 '16 at 08:42

From what I know, 30 years ago it was important for a compiler to be one-pass, because reads and writes to disk (or magnetic tape) were very slow and there was not enough memory to hold the whole code (thanks James Kanze). Also, a single pass is a requirement for scripting/interactive languages.

Nowadays compilers are usually not one-pass: the code is transformed into several intermediate representations (e.g. an Abstract Syntax Tree or Static Single Assignment form), which are then analysed and optimised.
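As a toy illustration of "build the representation once, walk it several times", here is a sketch of a tiny expression tree with two separate passes over it. Everything in it (`Node`, `num`, `var`, `bin`, `count_nodes`, `fold_constants`) is invented for this example and is not how GCC or any real compiler is structured.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// A toy expression tree, standing in for a compiler's in-memory
// representation of the source.
struct Node {
    char op;    // '+', '*', 'n' for a number literal, 'x' for a variable
    int value;  // meaningful only when op == 'n'
    std::unique_ptr<Node> lhs, rhs;
};

std::unique_ptr<Node> num(int v) {
    return std::unique_ptr<Node>(new Node{'n', v, nullptr, nullptr});
}

std::unique_ptr<Node> var() {
    return std::unique_ptr<Node>(new Node{'x', 0, nullptr, nullptr});
}

std::unique_ptr<Node> bin(char op, std::unique_ptr<Node> l,
                          std::unique_ptr<Node> r) {
    return std::unique_ptr<Node>(new Node{op, 0, std::move(l), std::move(r)});
}

// One pass over the tree: count the nodes (a stand-in for analysis).
int count_nodes(const Node* n) {
    return n ? 1 + count_nodes(n->lhs.get()) + count_nodes(n->rhs.get()) : 0;
}

// Another pass over the same tree: replace an operation on two literals
// with a single literal.
void fold_constants(Node* n) {
    if (!n || !n->lhs || !n->rhs) return;  // leaves have nothing to fold
    fold_constants(n->lhs.get());
    fold_constants(n->rhs.get());
    if (n->lhs->op == 'n' && n->rhs->op == 'n') {
        n->value = (n->op == '+') ? n->lhs->value + n->rhs->value
                                  : n->lhs->value * n->rhs->value;
        n->op = 'n';
        n->lhs.reset();
        n->rhs.reset();
    }
}

// Small usage check for the sketch above.
void demo_tree_passes() {
    // (2 * 3) + x, built once, as a parser would after reading the source.
    std::unique_ptr<Node> tree = bin('+', bin('*', num(2), num(3)), var());
    assert(count_nodes(tree.get()) == 5);
    fold_constants(tree.get());  // second walk over the same tree
    assert(count_nodes(tree.get()) == 3);
    assert(tree->lhs->op == 'n' && tree->lhs->value == 6);
}
```

The "source" is consumed exactly once when the tree is built; every later step works on the tree, which is why counting passes over the source text says little about a modern compiler.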

Some elements of C++ cannot be resolved without intermediate steps, e.g. in a class you can reference members which are declared only later in the class body. Also, all templates need to be remembered somehow so that they can be accessed later, during instantiation.

What usually does not happen is the source code being parsed several times; there is no need for that. So you should not see the same syntactic error reported several times.

CygnusX1
  • The reason single pass was important in the past was because main memory wasn't large enough to hold a complete internal representation of a function; if the compiler wasn't single pass, it would have to reread the data from disk. Or mag tape, which was still largely used then. – James Kanze Mar 16 '11 at 09:35
  1. No, I would be surprised if you found a heavily used single-pass C++ compiler.
  2. No, GCC makes multiple passes and even applies different optimizations based on the flags you pass it.

Advantages (single-pass): fast! Since all the source only needs to be examined once, the compilation phase (and thus the beginning of execution) can happen very quickly. It is also an attractive model because it makes the compiler easy to understand and oftentimes "easier" to implement. (I worked on a single-pass Pascal compiler once, but don't encounter them often, whereas single-pass interpreters are common.)

Disadvantages (single-pass): optimization, semantic/syntactic analysis. Sometimes a single look at the code lets things through that are easily caught by simple mechanisms in multiple passes (which is partly why we have tools like JSLint).

Advantages (multi-pass): optimizations, semantic/syntactic analysis. Even pseudo-interpreted languages like "JRuby" go through a compilation pipeline to get to Java/JVM bytecode before execution. You could consider this multi-pass, and the multiple looks at the varying representations of the code (and the resulting optimizations) can make it very fast.

Disadvantages (multi-pass): complexity, and sometimes time (depending on whether AOT or JIT compilation is used).

Also, single-pass is pretty common in academia to help learn the aspects of compiler design.

Brandon

Walter Bright, the developer of the first native C++ compiler (Zortech C++), has stated that he believes it is not possible to compile C++ without at least 3 passes. And, yes, that means 3 full text-transforming passes over the source, not just traversals through an internal tree representation. See his Dr. Dobb's magazine article, "Why is C++ compilation so slow?" So any hope of finding a true one-pass compiler seems doomed. (I think this was part of the motivation Bright had to develop D, his C++ alternative.)

librik

The compiler only needs to look at the sources once top down, but that does not mean that it does not have to process the parsed contents more than once. In particular with templates, it has to instantiate the templated code with the type, and that cannot happen until the template is used (or explicitly instantiated by the user), which is the reason for your duplicate errors:

When the template is defined, the compiler detects an error; at that point the type has not been substituted, which is why the first message names adder<T>. When the actual instantiation occurs, it substitutes the template arguments and processes the result, which is what triggers the second error, this time naming adder<char [21]>. Note that if the template were specialized for that particular type after the first definition and before the instantiation, the second error need not occur.
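The two processing steps can be observed without any error at all. The sketch below uses invented names (`describe`, `label`, `Widget`, `at_definition`, `at_instantiation`) and assumes a compiler that implements standard two-phase name lookup, which includes current GCC and Clang; a compiler that defers all lookup to instantiation would behave differently for the first function.

```cpp
#include <cassert>
#include <string>

std::string describe(double) { return "double"; }

template <typename T>
std::string at_definition(T) {
    // The argument does not depend on T, so this call is resolved when the
    // template is defined. Only describe(double) has been declared so far.
    return describe(1);
}

std::string describe(int) { return "int"; }  // too late for at_definition

struct Widget {};

template <typename T>
std::string at_instantiation(T t) {
    // The argument depends on T, so lookup is deferred until instantiation,
    // where argument-dependent lookup finds label(Widget) declared below.
    return label(t);
}

std::string label(Widget) { return "Widget"; }

// Small usage check for the sketch above.
void demo_two_phases() {
    assert(at_definition(0) == "double");
    assert(at_instantiation(Widget{}) == "Widget");
}
```

Both templates are read from the source once, yet part of each body is handled at the definition and part at the instantiation, which matches the two places an error in a template body can be reported.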

David Rodríguez - dribeas