
I am writing a parser for a language similar to JavaScript, including its semicolon insertion. For example:

var x = 1 + 2;
x;

and

var x = 1 + 2

x

and even

var x = 1 +
2
x

are the same.

For now, my lexer matches a newline (\n) only when it occurs after a token other than a semicolon. That works for basic situations like the first and second examples, but how can I deal with the third one, i.e. a newline in the middle of an expression? I can't emit a newline token every time, because that would pollute my parser (alternatives with newline tokens everywhere), and I can't ignore newlines entirely either, because a newline is a statement terminator. Ideally, when the parser reaches the end of a statement, it would check whether a newline or a semicolon occurred there.
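One way to read that last idea, sketched in Python rather than in a lex/yacc toolchain: the lexer never emits newline tokens, but records on every token whether a line break preceded it, and the parser consults that flag only at the end of a statement. This is a minimal sketch under assumptions: the toy grammar (`var`, identifiers, numbers, `+`) and the names `Token`, `newline_before`, `tokenize`, `Parser` and `end_of_statement` are all hypothetical and not taken from any existing tool.

```python
import re
from dataclasses import dataclass

@dataclass
class Token:
    kind: str             # VAR, IDENT, NUMBER, PLUS, ASSIGN, SEMICOLON or EOF
    text: str
    newline_before: bool  # True if a line break precedes this token

TOKEN_RE = re.compile(
    r"(?P<NEWLINE>\n)|(?P<WS>[ \t]+)|(?P<NUMBER>\d+)|(?P<IDENT>[A-Za-z_]\w*)"
    r"|(?P<PLUS>\+)|(?P<ASSIGN>=)|(?P<SEMICOLON>;)"
)

def tokenize(src):
    tokens, saw_newline, pos = [], False, 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r}")
        pos, kind = m.end(), m.lastgroup
        if kind == "NEWLINE":
            saw_newline = True          # remember it, but emit no token
        elif kind != "WS":
            if kind == "IDENT" and m.group() == "var":
                kind = "VAR"
            tokens.append(Token(kind, m.group(), saw_newline))
            saw_newline = False
    tokens.append(Token("EOF", "", saw_newline))
    return tokens

class Parser:
    def __init__(self, src):
        self.toks, self.i = tokenize(src), 0

    def peek(self):
        return self.toks[self.i]

    def next(self):
        tok = self.toks[self.i]
        self.i += 1
        return tok

    def expect(self, kind):
        tok = self.next()
        if tok.kind != kind:
            raise SyntaxError(f"expected {kind}, got {tok.kind}")
        return tok

    def program(self):
        stmts = []
        while self.peek().kind != "EOF":
            stmts.append(self.statement())
        return stmts

    def statement(self):
        if self.peek().kind == "VAR":
            self.next()
            name = self.expect("IDENT").text
            self.expect("ASSIGN")
            node = ("var", name, self.expr())
        else:
            node = ("expr", self.expr())
        self.end_of_statement()
        return node

    def end_of_statement(self):
        # The only place where line breaks matter: accept ';', end of
        # input, or a following token that starts on a new line.
        tok = self.peek()
        if tok.kind == "SEMICOLON":
            self.next()
        elif tok.kind != "EOF" and not tok.newline_before:
            raise SyntaxError(f"expected ';' or newline before {tok.text!r}")

    def expr(self):
        # The expression rules never look at newline_before, so
        # '1 +' followed by '2' on the next line is one expression.
        node = self.primary()
        while self.peek().kind == "PLUS":
            self.next()
            node = ("+", node, self.primary())
        return node

    def primary(self):
        tok = self.next()
        if tok.kind == "NUMBER":
            return int(tok.text)
        if tok.kind == "IDENT":
            return tok.text
        raise SyntaxError(f"unexpected token {tok.kind}")
```

With this sketch, `Parser(src).program()` produces the same tree for all three snippets above, because the expression rules are unchanged and only `end_of_statement` knows about line breaks. One consequence of the design: an expression that continues on the next line with a leading `+` is also treated as a continuation.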

Timofei Davydik
Krzysztof Kaczor
  • JavaScript's ASI only inserts a semicolon when it encounters a syntax error, which is different to having an optional semicolon. It would probably be better to handle this in the parser and treat line terminators as significant tokens. – Qantas 94 Heavy Aug 31 '14 at 11:34
  • but then my parser rule for addition would look like: ` addition : expr PLUS expr | expr NEWLINE PLUS EXPR ...` So it is a no go. Am I missing something? – Krzysztof Kaczor Aug 31 '14 at 17:06
  • Possible duplicate of [Parsing optional semicolon at statement end](https://stackoverflow.com/questions/10970699/parsing-optional-semicolon-at-statement-end) – chharvey Jun 24 '19 at 10:19

1 Answer


This has gone unanswered for a while. I cannot see why you cannot make the statement separator a newline **or** a semicolon. Something like this:

whitespace    [ \t]+
%%
{whitespace}    /* Skip */
;[\n]*        return(SEMICOLON);
[\n]+         return(SEMICOLON);

Then your grammar is not messed up at all, because the parser only ever sees the SEMICOLON token.
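To see what token stream those rules produce, here is a rough Python equivalent of the lexer above. It is a sketch, not the lex specification itself: the `NUMBER`, `IDENT` and `OP` rules and the `tokenize` helper are assumptions added only so the example can run on the snippets from the question.

```python
import re

# Mirrors the rules above: whitespace is skipped, and both ';' followed
# by any newlines and a bare run of newlines become one SEMICOLON.
# NUMBER, IDENT and OP are hypothetical extras for the demo.
TOKEN_RE = re.compile(
    r"(?P<WS>[ \t]+)"
    r"|(?P<SEMICOLON>;\n*|\n+)"
    r"|(?P<NUMBER>\d+)"
    r"|(?P<IDENT>[A-Za-z_]\w*)"
    r"|(?P<OP>[+=])"
)

def tokenize(src):
    """Return the list of token kinds for src, skipping whitespace."""
    kinds, pos = [], 0
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r}")
        pos = m.end()
        if m.lastgroup != "WS":
            kinds.append(m.lastgroup)
    return kinds

with_semicolons = tokenize("var x = 1 + 2;\nx;\n")
with_newlines = tokenize("var x = 1 + 2\n\nx\n")
split_expression = tokenize("var x = 1 +\n2\nx")
```

Here `with_semicolons` and `with_newlines` come out identical, which is the point of the rules. For the third snippet from the question, `split_expression` contains a SEMICOLON directly after the `+`, so under these rules alone a line break inside an expression still reaches the grammar as a terminator.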

Brian Tompsett - 汤莱恩