I need to finish my small-C to p-code compiler in a couple of weeks, but I am having real trouble understanding how to build the symbol table and the subsequent code generation phase. Where can I start? I've seen a couple of examples, but I'm not grasping the overall concept.

As you can see in the attached grammar, many rewriting rules are used and the grammar is quite long. Unfortunately, it took a long time to derive from the YACC grammar, omitting the parts we wouldn't need, and at this point I don't know whether this will affect us later when we try to come up with the code generation.

Any advice or hints are welcome, thanks.

1 Answer

This is quite a broad question and difficult to answer as a whole. You should break down bigger tasks into smaller subtasks and ask questions about them here.

As a general idea: your language has rules that assign values to identifiers (LHS) and others that use identifiers in expressions, including simple assignments (RHS). These are the symbols you have to collect in your symbol table. There may be more symbols, such as those in type or variable definitions. All of them are in your syntax tree. You can make your life easier by defining your grammar rules so that each kind of identifier has its own rule (with its own token type), like:

variable_name:
    identifier -> ^(VARIABLE_NAME identifier)
;

typedef_name:
    identifier -> ^(TYPEDEF_NAME identifier)
;

etc. This way you can easily identify the relevant tokens for your symbol table. You then only have to walk over your syntax tree and pick up the text from the special tokens, which is a straightforward depth-first search.
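To make the tree walk concrete, here is a minimal sketch in Python. It is not tied to any parser generator: the `Node` class, the `collect_symbols` function, and the sample tree are all hypothetical stand-ins for whatever tree type your parser actually produces. It only assumes that each marked identifier appears as a node of type VARIABLE_NAME or TYPEDEF_NAME with the identifier as its first child, as in the rewrite rules above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical syntax tree node: a token type, its text, and child nodes."""
    type: str
    text: str = ""
    children: list = field(default_factory=list)


# Token types that mark identifiers (assumed to match the rewrite rules above).
SYMBOL_TYPES = {"VARIABLE_NAME", "TYPEDEF_NAME"}


def collect_symbols(root):
    """Walk the tree depth-first and return {identifier text: token type}."""
    table = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node.type in SYMBOL_TYPES and node.children:
            # ^(VARIABLE_NAME identifier): the identifier is the first child.
            # setdefault keeps the first occurrence of each name.
            table.setdefault(node.children[0].text, node.type)
        # Push children in reverse so they are visited left to right.
        stack.extend(reversed(node.children))
    return table


# Hand-built tree for something like: typedef ... word;  x = y + 1;
tree = Node("PROGRAM", children=[
    Node("TYPEDEF", children=[
        Node("TYPEDEF_NAME", children=[Node("ID", "word")]),
    ]),
    Node("ASSIGN", children=[
        Node("VARIABLE_NAME", children=[Node("ID", "x")]),
        Node("ADD", children=[
            Node("VARIABLE_NAME", children=[Node("ID", "y")]),
            Node("INT", "1"),
        ]),
    ]),
])

symbols = collect_symbols(tree)
print(symbols)
```

A real symbol table would usually store more than the token type per name (declared type, scope, storage offset for code generation), but the collection step can stay this simple.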

Mike Lischke