I am writing a parser for a language similar to JavaScript, including its semicolon insertion. For example:
var x = 1 + 2;
x;
and
var x = 1 + 2
x
and even
var x = 1 +
2
x
are the same.
For now my lexer emits a newline token (\n) only when the newline follows a token other than a semicolon. That works for basic situations like the first two examples, but how can I deal with the third one, i.e. a newline occurring in the middle of an expression? I can't emit a newline token every time, because that would pollute my parser (I would have to add alternatives with newline tokens everywhere), and I can't drop newlines entirely either, because a newline is a statement terminator. Basically, it would be best if I could somehow check, when the parser reaches the end of a statement, whether there was a newline character or a semicolon at that point.
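To make the last idea concrete, here is a minimal sketch in Python of one way it could work, assuming a hand-written recursive-descent parser rather than a generated one. The lexer never emits newline tokens; it only records on each token whether a line break preceded it, and the parser looks at that flag only at the end of a statement. All names here (Token, newline_before, end_statement, parse) and the tiny grammar (only var declarations and + expressions) are hypothetical and just for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str             # "num", "ident", "op", "var", "semi" or "eof"
    text: str
    newline_before: bool  # True if a line break separates this token from the previous one


TOKEN_RE = re.compile(
    r"(?P<ws>[ \t\r\n]+)|(?P<num>\d+)|(?P<ident>[A-Za-z_]\w*)|(?P<op>[+=])|(?P<semi>;)"
)


def lex(src):
    """Drop whitespace, but remember whether it contained a newline."""
    tokens, pos, saw_newline = [], 0, False
    while pos < len(src):
        m = TOKEN_RE.match(src, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {src[pos]!r}")
        pos = m.end()
        kind = m.lastgroup
        if kind == "ws":
            saw_newline = saw_newline or "\n" in m.group()
            continue
        if kind == "ident" and m.group() == "var":
            kind = "var"
        tokens.append(Token(kind, m.group(), saw_newline))
        saw_newline = False
    tokens.append(Token("eof", "", saw_newline))
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.i = tokens, 0

    def peek(self):
        return self.tokens[self.i]

    def advance(self):
        tok = self.tokens[self.i]
        self.i += 1
        return tok

    def expect(self, kind, text=None):
        tok = self.peek()
        if tok.kind != kind or (text is not None and tok.text != text):
            raise SyntaxError(f"expected {text or kind}, got {tok.text!r}")
        return self.advance()

    def end_statement(self):
        # The only place where newlines matter: a statement ends at ';',
        # at end of input, or where the next token starts on a new line.
        tok = self.peek()
        if tok.kind == "semi":
            self.advance()
        elif tok.kind != "eof" and not tok.newline_before:
            raise SyntaxError(f"expected ';' or newline before {tok.text!r}")

    def primary(self):
        tok = self.peek()
        if tok.kind in ("num", "ident"):
            return self.advance().text
        raise SyntaxError(f"unexpected token {tok.text!r}")

    def expression(self):
        # Newlines inside an expression are invisible here, so "1 +\n2" just works.
        left = self.primary()
        while self.peek().kind == "op" and self.peek().text == "+":
            self.advance()
            left = ("+", left, self.primary())
        return left

    def statement(self):
        if self.peek().kind == "var":
            self.advance()
            name = self.expect("ident").text
            self.expect("op", "=")
            node = ("var", name, self.expression())
        else:
            node = ("expr", self.expression())
        self.end_statement()
        return node

    def program(self):
        statements = []
        while self.peek().kind != "eof":
            statements.append(self.statement())
        return statements


def parse(src):
    return Parser(lex(src)).program()
```

With this sketch, parse("var x = 1 + 2;\nx;"), parse("var x = 1 + 2\nx") and parse("var x = 1 +\n2\nx") all produce the same tree, while parse("var x = 1 2") is rejected because neither a semicolon nor a newline separates the two statements. Note that a newline only ends a statement when the following token cannot continue it, so "1\n+ 2" is still read as a single expression, which is similar in spirit to how JavaScript behaves. With a parser generator the same flag would have to be consulted through whatever predicate or lookahead hook the tool offers, which I have not tried.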