
I have two orthogonal questions related to symbol tables:

  1. Should I build the symbol table and perform type checking as I parse the code? Parsing first and then traversing the AST to build the symbol table looks cleaner to me. However, I like the idea of having an immutable AST (similar to Clang), and I can't have that in a two-step process (as I would need to insert extra type conversion nodes in the type checking phase).

  2. Should the symbol table be responsible for doing type checking? I read multiple articles in which symbol tables are used for this purpose. Is that a recommended practice? It looks rather awkward to me.

Note: I am using a top-down recursive descent parser.

Touloudou

1 Answer


I believe this is what you should do:

First question: build your AST first, then, as you said, traverse it to fill the symbol table and do the type checking. An immutable AST is attractive, but it makes that traversal less clean.

Second question: yes, the symbol table should play a part in type checking, but it should not do the type checking itself. The checker needs it to store and look up the types of things like variables. There is nothing awkward about that :-)
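To make the split concrete, here is a minimal sketch of the two-pass approach, assuming a toy language with only `int` and `float`. The node classes, `SymbolTable`, and `TypeChecker` are hypothetical names made up for illustration, and the rule that both operands of a binary op must have the same type is a simplifying assumption. The parser would only build the nodes; the checker then walks them and uses the symbol table purely as storage.

```python
from dataclasses import dataclass


# AST nodes as the parser would build them: plain data, no checks in constructors.
@dataclass(frozen=True)
class IntLit:
    value: int


@dataclass(frozen=True)
class FloatLit:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object


@dataclass(frozen=True)
class VarDecl:
    name: str
    declared_type: str
    init: object


class SymbolTable:
    """Stores name -> type. It records and looks up; it does not check."""

    def __init__(self):
        self._types = {}

    def declare(self, name, type_name):
        if name in self._types:
            raise TypeError(f"redeclaration of {name!r}")
        self._types[name] = type_name

    def lookup(self, name):
        if name not in self._types:
            raise TypeError(f"undeclared variable {name!r}")
        return self._types[name]


class TypeChecker:
    """Second pass: visits the finished AST and consults the symbol table."""

    def __init__(self):
        self.symbols = SymbolTable()

    def check(self, node):
        # Dispatch on the node class name, e.g. BinOp -> check_BinOp.
        return getattr(self, "check_" + type(node).__name__)(node)

    def check_IntLit(self, node):
        return "int"

    def check_FloatLit(self, node):
        return "float"

    def check_Var(self, node):
        return self.symbols.lookup(node.name)

    def check_BinOp(self, node):
        left = self.check(node.left)
        right = self.check(node.right)
        if left != right:  # assumption: no implicit conversions in this sketch
            raise TypeError(f"cannot apply {node.op!r} to {left} and {right}")
        return left

    def check_VarDecl(self, node):
        init_type = self.check(node.init)
        if init_type != node.declared_type:
            raise TypeError(f"{node.name!r} is {node.declared_type}, got {init_type}")
        self.symbols.declare(node.name, node.declared_type)
        return node.declared_type


# Pass 1 (parsing) would produce this list; pass 2 checks it.
program = [
    VarDecl("x", "int", IntLit(1)),
    VarDecl("y", "int", BinOp("+", Var("x"), IntLit(2))),
]
checker = TypeChecker()
for stmt in program:
    checker.check(stmt)
```

The type of `y`'s initializer is computed from the `BinOp` on the fly, while the type of `x` comes from the symbol table, which is the only thing the table contributes.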

xilpex
  • *The immutable AST seems good, but it won't be as clean traversing the AST*. Would you mind expanding on that? Right now, several of my AST nodes already carry type information (e.g. variable declarations have a name + a type). Will this be duplicated in the symbol table? Should other AST nodes (e.g. binary expressions) also have a type that gets filled in during type checking? (-> this would require a mutable AST) – Touloudou May 03 '20 at 17:57
  • @Touloudou -- Ah, so that's what you are doing. In most of the compilers I wrote, I didn't compute the type information immediately (as soon as the AST node is constructed). Instead, I let the nodes get constructed (*no* evaluation in the constructor), then I ran the type checker over them; finally, I called the methods that did the evaluating. It makes for a very clean design. – xilpex May 03 '20 at 18:02
  • @Touloudou -- Also, yes -- binary ops should have a type (once they have been run through the type checker). You'll need that because when a binary op is assigned to a variable, you can then easily get its type. – xilpex May 03 '20 at 18:04
  • I am not sure I follow. How can you not parse the types when constructing your AST? When you encounter a variable declaration, where do you keep the type information if not in the AST node? Ok for binary ops, thank you! – Touloudou May 03 '20 at 18:09
  • @Touloudou -- This is how you handle the types after building the AST: you have a type-checking visitor, which visits all the AST nodes, adding the type info to the symbol table and checking all the other types. If you want to, you can also annotate the AST nodes with their respective types. – xilpex May 03 '20 at 18:13
  • To make sure we are on the same page: first, I parse the types, but basically only for leaf nodes (the actual variable declarations). Then, in a second step, I add this existing info to the symbol table and deduce the types of non-leaf nodes during type checking. Is that correct? – Touloudou May 03 '20 at 18:20
  • Yes, sort of (if I understood correctly). First, build the AST without any types. Then, run the type checker over the AST nodes, filling the symbol table (and sometimes annotating the AST nodes with their types). – xilpex May 03 '20 at 18:22
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/213037/discussion-between-touloudou-and-xilpex). – Touloudou May 03 '20 at 18:24
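
The comments above leave one point open: how to give non-leaf nodes a type without making the AST mutable. One option, sketched below under stated assumptions, is to keep the annotations in side tables keyed by node instead of writing them onto the nodes. The node classes, `annotate`, and the int-to-float promotion rule are all hypothetical and only there for illustration; this is not how Clang or any particular compiler does it.

```python
from dataclasses import dataclass


# frozen=True makes nodes immutable; eq=False keeps identity-based hashing,
# so two structurally equal nodes are still distinct dictionary keys.
@dataclass(frozen=True, eq=False)
class IntLit:
    value: int


@dataclass(frozen=True, eq=False)
class FloatLit:
    value: float


@dataclass(frozen=True, eq=False)
class BinOp:
    op: str
    left: object
    right: object


def annotate(node, types, conversions):
    """Record each node's type in `types` without touching the AST.

    `conversions` maps an operand node to the type it must be converted to,
    which stands in for inserting an explicit conversion node.
    """
    if isinstance(node, IntLit):
        result = "int"
    elif isinstance(node, FloatLit):
        result = "float"
    elif isinstance(node, BinOp):
        left = annotate(node.left, types, conversions)
        right = annotate(node.right, types, conversions)
        # Assumed promotion rule: mixing int and float yields float.
        result = "float" if "float" in (left, right) else "int"
        for child, child_type in ((node.left, left), (node.right, right)):
            if child_type != result:
                conversions[child] = result
    else:
        raise TypeError(f"unknown node {node!r}")
    types[node] = result
    return result


expr = BinOp("+", IntLit(1), FloatLit(2.5))
types, conversions = {}, {}
annotate(expr, types, conversions)
```

After the call, `types` holds a type for every node and `conversions` marks the `IntLit` operand as needing a conversion to `float`, while `expr` itself is unchanged. A later pass (code generation, say) would have to consult both tables, which is the price of keeping the tree frozen.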