I've been studying the grammar and AST nodes for various languages with AST explorer. With Python, I've noticed that some form of semantic analysis seems to take place during the parsing process. For example:
x = 2
x = 2
This yields an AST consisting of a `VariableDeclaration` node and an `ExpressionStatement` node.
So when the first `x = 2` line is parsed, the parser checks a symbol table for the existence of `x`, registers it, and produces a `VariableDeclaration` node. Then, when the second `x = 2` line is parsed, it finds that `x` is already defined and produces an `ExpressionStatement` node.
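To make the behaviour I am describing concrete, here is a minimal sketch of what I imagine the parser is doing. It is a toy that only handles lines of the form `name = value`; the function name `parse_assignments` and the dictionary layout are my own invention for illustration, not the actual parser used by AST explorer.

```python
def parse_assignments(source):
    """Toy parser: one `name = value` assignment per line (hypothetical)."""
    symbols = set()  # stand-in for a symbol table
    nodes = []
    for line in source.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, value = line.partition("=")
        name = name.strip()
        if name in symbols:
            # Name already registered: treat the line as a plain assignment.
            kind = "ExpressionStatement"
        else:
            # First time the name is seen: register it as a declaration.
            symbols.add(name)
            kind = "VariableDeclaration"
        nodes.append({"type": kind, "name": name, "value": value.strip()})
    return nodes


nodes = parse_assignments("x = 2\nx = 2")
print([node["type"] for node in nodes])
# ['VariableDeclaration', 'ExpressionStatement']
```

The point of the sketch is only that the node type depends on state accumulated while parsing, which is what looks like semantic analysis to me.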
However, when I try the following semantically incorrect code:
2 + "string"
The parser accepts the code and produces an `ExpressionStatement` node, even though the code is semantically incorrect (i.e. `int + string`). The Python interpreter, rightly, produces an error when I attempt to execute it.
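The same split between parsing and execution can be reproduced with Python's standard `ast` module. This is a small sketch rather than what AST explorer runs internally, and CPython names its nodes differently (`Expr` and `BinOp` rather than `ExpressionStatement`):

```python
import ast

source = '2 + "string"'

# Parsing succeeds: the grammar only requires <expr> + <expr>.
tree = ast.parse(source)
stmt = tree.body[0]
print(type(stmt).__name__)        # Expr
print(type(stmt.value).__name__)  # BinOp

# The type mismatch only surfaces when the code is actually executed.
error = None
try:
    exec(compile(tree, "<example>", "exec"))
except TypeError as exc:
    error = exc
print("raised at run time:", type(error).__name__)  # TypeError
```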
This suggests to me that semantic analysis takes place twice: once during the parsing process and once again while traversing the complete AST. Is this assumption correct? If so, why is this the case? Wouldn't it be simpler to do the entire semantic analysis phase during parsing instead of splitting it up?