Multiline comments in a recursive descent parser

Question

I'm trying to wrap my head around how to handle C-style multiline comments (/* */) with a recursive descent parser. Since these comments can appear anywhere, how do you account for them? For example, suppose you're parsing a sentence into word tokens: what do you do if there's a comment inside a word?

Ex.

This is a sentence = word word word word

vs

This is a sen/*sible*/tence = ???

Thanks!

Did you write a lexer/tokenizer first? You could just ignore anything between `/*` and `*/` when breaking your program text into tokens. — eigenchris, Mar 06 '15 at 03:29

score 1 · Accepted Answer · answered Mar 06 '15 at 04:08


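In C, as in pretty much every other programming language, a comment is effectively whitespace, so a comment cannot occur within a token. In C itself, `sen/*sible*/tence` is read as two tokens, `sen` and `tence`, exactly as if the comment were a space.

So comments cannot interrupt the parsing of a token. They only need to be recognized and skipped by the tokenizer, the same way whitespace is, and the parser never sees them.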
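A minimal sketch of that idea, assuming Python and a made-up grammar in which the only tokens are alphabetic words (`TOKEN_RE` and `tokenize` are illustrative names, not anything from the question):

```python
import re

# Hypothetical token grammar for the "sentence of words" example.
# Comments and whitespace are matched so they can be skipped; only
# words are handed on to the parser.
TOKEN_RE = re.compile(
    r"""
    (?P<comment>/\*.*?\*/)   # C-style comment, non-greedy, may span lines
  | (?P<space>\s+)           # ordinary whitespace
  | (?P<word>[A-Za-z]+)      # a word token
    """,
    re.VERBOSE | re.DOTALL,
)


def tokenize(text):
    """Return the word tokens in text, treating comments as whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            # Also reached for an unterminated comment: the lone "/" matches nothing.
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup == "word":
            tokens.append(match.group())
        pos = match.end()
    return tokens
```

With this sketch, `tokenize("This is a sen/*sible*/tence")` yields five words, with `sen` and `tence` as separate tokens. The recursive descent parser then works on that list and needs no comment handling of its own.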


rici


So if I want to still keep track of comments and where they are in the text, should I do two passes through the text? One ignoring comments, and the other only looking for comments? – John Wonderick Mar 06 '15 at 05:02

@JohnWonderick You can keep a separate list of where comments are without a second pass. But comments really are irrelevant to parsing. If you are trying to build a pretty-printer or some such, you might create a linked list/vector of tokens as you tokenize, but do the parse itself only with meaningful tokens. – rici Mar 06 '15 at 05:34
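
One way the single-pass approach could look, again as a sketch in Python under the same assumed word-only grammar (`Lexeme`, `LEXEME_RE`, and `scan` are hypothetical names): the tokenizer records each comment with its offset in a side list while handing only meaningful tokens to the parser.

```python
import re
from typing import NamedTuple


class Lexeme(NamedTuple):
    kind: str    # "word" or "comment"
    text: str
    offset: int  # index of the first character in the source text


# Hypothetical grammar: comments, whitespace, and alphabetic words only.
LEXEME_RE = re.compile(
    r"(?P<comment>/\*.*?\*/)|(?P<space>\s+)|(?P<word>[A-Za-z]+)",
    re.DOTALL,
)


def scan(text):
    """Tokenize once, returning (tokens_for_parser, comments)."""
    tokens, comments = [], []
    pos = 0
    while pos < len(text):
        match = LEXEME_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind = match.lastgroup
        if kind == "word":
            tokens.append(Lexeme(kind, match.group(), match.start()))
        elif kind == "comment":
            comments.append(Lexeme(kind, match.group(), match.start()))
        # Whitespace is dropped entirely.
        pos = match.end()
    return tokens, comments
```

The parser consumes only `tokens`. A pretty-printer could later merge `comments` back in by comparing offsets, since both lists are already in source order.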
