
What are the symbol table and AST needed for during code compilation?

I'm trying to get a basic, high-level understanding of the code compilation process.

I understand the basic steps to be:

Lexical analysis
Syntax analysis
Semantic analysis
Code generation
Code optimisation
Linking

As I understand it, the symbol table starts to get built during the lexical analysis step, as the code is lexed. At that point it would include the token type and the actual tokens identified. During later steps, additional information such as scope and data type is added to the symbol table. If I understand correctly, the AST is built during syntax analysis; it represents the structure of the code and is annotated with the same information as the symbol table.
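To make the distinction concrete, here is a minimal sketch of the two structures for a toy language that has only integer assignments. The node classes (`Num`, `Name`, `Assign`), the `build_symbol_table` helper, and the `"type"`/`"scope"` fields are all hypothetical names chosen for illustration, not taken from any real compiler. It also assumes one possible arrangement, in which the symbol table is filled in by a pass over the finished AST; real compilers may populate it earlier or incrementally.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class Name:
    ident: str


@dataclass
class Assign:
    target: Name
    value: object  # either a Num or a Name in this toy language


def build_symbol_table(statements):
    """Walk the AST and record one entry per assigned name (toy semantic pass)."""
    table = {}
    for stmt in statements:
        if isinstance(stmt.value, Num):
            inferred = "int"
        else:
            # Copy the type of the referenced name; raises KeyError if undeclared.
            inferred = table[stmt.value.ident]["type"]
        table[stmt.target.ident] = {"type": inferred, "scope": "global"}
    return table


# AST for the two statements:  x = 5   and   y = x
program = [Assign(Name("x"), Num(5)), Assign(Name("y"), Name("x"))]
symbols = build_symbol_table(program)
```

In this sketch the AST (`program`) records how the statements are structured, while the symbol table (`symbols`) is a flat lookup from each name to facts about it, which is roughly the split I am asking about.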

I'm confused as to why both the symbol table and the AST are needed. Is one used to build the other? Are both of them fed into the code generation step? Is this language dependent (compiled vs. interpreted)?

Where does whitespace removal take place? I've been told this happens during lexical analysis, but if that's the case, my var = 5 would get converted to myvar=5, which is now syntactically correct.
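A minimal sketch of how a lexer might treat whitespace, assuming a toy token set of identifiers, integers, and `=`. The `TOKEN_RE` pattern and the `tokenize` function are hypothetical and written only for this example. The idea it illustrates is that whitespace can be skipped as a separator between tokens rather than deleted from the source text before tokens are recognized.

```python
import re

# Optional leading whitespace, then exactly one token.
TOKEN_RE = re.compile(
    r"\s*(?:(?P<IDENT>[A-Za-z_]\w*)|(?P<NUMBER>\d+)|(?P<OP>=))"
)


def tokenize(text):
    """Split text into (kind, lexeme) pairs, skipping whitespace between tokens."""
    tokens = []
    pos = 0
    text = text.rstrip()  # trailing whitespace carries no token
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        kind = match.lastgroup  # name of the alternative that matched
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens


with_space = tokenize("my var = 5")
without_space = tokenize("myvar = 5")
```

Under these assumptions, `with_space` holds two separate identifier tokens (`my` and `var`) followed by `=` and `5`, whereas `without_space` holds a single identifier `myvar`. The space never appears as a token, yet it still determines where one token ends and the next begins, so a later syntax analysis step would see two different token sequences.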

Thanks for your input.

  • 2
    The AST and the symbol table *are not* the same. The AST typically contains rich enough information about the overall structure for later steps whereas the symbol table is an associative structure containing information about the symbols found/produced throughout. – Frank C. Oct 25 '18 at 14:57

0 Answers