
I have a BNF grammar:

{
    tokens = [
        COLON = ":"
        space=' '
        word = 'regexp:[^\r\n\s\t@\$\{\}\(\)\|\#:<>]+'
        nl = 'regexp:\r|\n|(\r\n)'
    ]
}

root ::= nlsp book_keyword COLON [space] book_title sections
book_keyword ::= 'Journal Book' | 'Fiction Book'
book_title ::= (! section (word | string) space?)+

sections ::= section+

section ::= nlsp section_keyword COLON [space] section_title {recoverWhile='sectionRecover'}
section_keyword ::= 'Section' | 'Content'
section_title ::= (!section (word | space | COLON))+

sectionRecover ::= !(nlsp| section_keyword)

nlsp ::= (NL| space)*

Text to test:

Fiction Book: Some Fiction
    Section: Chapter One
    Section: Chapter Two Section
    Content: Chapter Three

If I make an error in the second or a later element, everything is fine, but if the error is in the first one (Sectio: Chapter One), the whole PSI tree is broken.


1 Answer


I see several problems:

1) You should have a whitespace token, something like this:

WHITESPACE="regexp:[ \n\r\t\f]"

As a consequence, you don't need space and nlsp anymore.

2) The recoverWhile rule should be specified without quotes.

3) sectionRecover matches whitespace, which is most likely incorrect.
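
Points 2 and 3 might look like the following sketch. It keeps the rule names from the question and its whitespace-sensitive tokens; the `private` modifier and the comments are my additions, and I have not run it against the sample text. Note that nlsp is a `*` repetition and can therefore match the empty string, which as far as I can tell means `!(nlsp | section_keyword)` never succeeds and the recovery skips nothing.

```bnf
// recoverWhile refers to the predicate rule by name, without quotes
section ::= nlsp section_keyword COLON [space] section_title {recoverWhile=sectionRecover}

// Keep skipping tokens while the next one is NOT a section keyword;
// whitespace is deliberately left out of the stop set
private sectionRecover ::= !section_keyword
```

With this predicate, recovery after a broken section should consume everything up to the next 'Section' or 'Content' keyword, so the following section can start cleanly.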

Argb32
It is a whitespace-sensitive grammar, so I've disabled the whitespace feature. BTW, that does not fix the scope mess. As far as I can see, the first section element is covered only by the root level and is no longer placed inside sections. This behavior occurs only for the first element in sections. – Viktor Sidochenko Nov 21 '18 at 08:33