I am writing a front-end for a language (using ocamllex and ocamlyacc).
So the front-end can build an Abstract Syntax Tree (AST) from a program. Then we often write a pretty printer, which takes an AST and prints a program. If we only want to compile or analyse the AST later, most of the time we don't need the printed program to match the original exactly in terms of white-space. However, this time I want to write a pretty printer that prints exactly the same program as the original one, white-space included.
Therefore, my question is: what are the best practices for handling white-space without modifying the AST types too much? I really don't want to add a number (of white-spaces) to each type in the AST.
For example, this is how I currently deal with (i.e., skip) white-space in lexer.mll:
rule token = parse
...
| [' ' '\t'] { token lexbuf } (* skip blanks *)
| eof { EOF }
Does anyone know how to change this, as well as other parts of the front-end, to correctly take white-space into account for later printing?
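To make the requirement concrete, here is a toy sketch of the round-trip property I am after. It is not my real front-end: it uses a hand-written scanner instead of ocamllex, and the `token` record, `tokenize` and `print` are hypothetical names invented for this illustration. The only idea shown is that if each token keeps the blanks that preceded it, printing the tokens back reproduces the input exactly. Whether something like this can be carried through ocamlyacc without touching every AST type is precisely what I don't know.

```ocaml
(* Toy illustration only: a hand-written scanner standing in for ocamllex.
   Each token remembers the blanks that preceded it. *)
type token = { ws : string; text : string }  (* hypothetical record *)

let is_blank c = c = ' ' || c = '\t' || c = '\n'

(* Split the input into tokens, keeping leading blanks with each token.
   Trailing blanks end up on a final token whose text is empty (like EOF). *)
let tokenize (s : string) : token list =
  let n = String.length s in
  (* advance from position i while predicate p holds *)
  let rec span p i = if i < n && p s.[i] then span p (i + 1) else i in
  let rec go i acc =
    let j = span is_blank i in
    let k = span (fun c -> not (is_blank c)) j in
    let tok = { ws = String.sub s i (j - i); text = String.sub s j (k - j) } in
    if k >= n then List.rev (tok :: acc) else go k (tok :: acc)
  in
  go 0 []

(* Printing is just concatenation, so the round trip is exact. *)
let print (toks : token list) : string =
  String.concat "" (List.map (fun t -> t.ws ^ t.text) toks)

let () =
  let src = "let  x =\t1 +  2 \n" in
  assert (print (tokenize src) = src)
```

In this sketch the white-space lives on the tokens rather than in the AST, which is the kind of separation I would like to keep in the real front-end.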