What are the symbol table and AST needed for during code compilation?
I'm trying to get a basic, high-level understanding of the code compilation process.
I understand the basic steps to be:
Lexical analysis
Syntax analysis
Semantic analysis
Code generation
Code optimisation
Linking
As I understand it, the symbol table starts to be built during lexical analysis, as the code is lexed. At that point it would hold each token identified along with its token type. During later steps, additional information such as scope and data type is added to it. If I understand correctly, the AST is built during syntax analysis; it represents the structure of the code and is annotated with the same information as the symbol table.
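To make the two structures concrete, here is a small sketch using Python's standard-library `ast` and `symtable` modules. It only illustrates what each structure holds for one tiny program; the sample source and the variable names are my own, and other compilers may organise things quite differently.

```python
import ast
import symtable

# A tiny sample program, chosen only for illustration.
source = "x = 5\ny = x + 1\n"

# The AST records the structure of the program: here, two assignment
# statements, the second of which contains a binary operation.
tree = ast.parse(source)
statement_kinds = [type(node).__name__ for node in tree.body]

# The symbol table records facts about names: which identifiers exist
# in a scope and how each one is used.
table = symtable.symtable(source, "<example>", "exec")
names = sorted(table.get_identifiers())
x_is_assigned = table.lookup("x").is_assigned()
x_is_referenced = table.lookup("x").is_referenced()
y_is_referenced = table.lookup("y").is_referenced()

print(statement_kinds)
print(names, x_is_assigned, x_is_referenced, y_is_referenced)
```

The AST answers "what is the shape of this code?", while the symbol table answers "what do we know about this name?", which is the distinction I am trying to pin down.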
I'm confused as to why both the symbol table and the AST are needed. Is one used to build the other? Are both of them fed into the code generation step? Does this depend on the language (compiled vs interpreted)?
Where does whitespace removal take place? I've been told this happens during lexical analysis, but if that's the case, the invalid statement my var = 5
would get converted to myvar=5
which is now syntactically correct.
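For concreteness, here is a minimal sketch of how I assume a lexer could treat whitespace: as a separator that is discarded after tokens have been split, rather than as characters deleted from the source text beforehand. The token rules and the names `TOKEN_SPEC`, `tokenize` and `is_assignment` are hypothetical, made up for a toy language and not taken from any real compiler.

```python
import re

# Hypothetical token rules for a toy language.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("ASSIGN", r"="),
    ("SKIP", r"\s+"),  # whitespace is matched as its own token kind
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Return (kind, lexeme) pairs, dropping whitespace after it is matched."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def is_assignment(tokens):
    """Toy 'parser': accept exactly IDENT ASSIGN NUMBER."""
    return [kind for kind, _ in tokens] == ["IDENT", "ASSIGN", "NUMBER"]


spaced = tokenize("my var = 5")
joined = tokenize("myvar=5")
print(spaced, is_assignment(spaced))
print(joined, is_assignment(joined))
```

Under these assumptions, `my var = 5` yields two separate identifier tokens and is rejected by the toy parser, while `myvar=5` yields one identifier and is accepted, so the two inputs are never confused.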
Thanks for your input.