The regex [a-zA-Z-]+:: matches strings such as one-name::.
I would like a regex that will do the opposite. Is there an "automated" way to build such a regex?
I can't simply test whether the first regex fails to match, because I need an actual "negated" regex to use inside a lark grammar.
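To make the goal concrete, here is a minimal sketch using only Python's re module. It assumes that a negative lookahead is an acceptable way to "negate" the pattern; the names BLOCKNAME, NOT_BLOCKNAME and classify are hypothetical and only serve the illustration. It has not been tested inside a lark grammar, and whether lark's lexer handles such a lookahead well in this grammar is an open assumption.

```python
import re

# The pattern from the question: letters and hyphens followed by "::".
BLOCKNAME = re.compile(r"[a-zA-Z-]+::")

# Hypothetical "negated" pattern: the negative lookahead refuses any text
# that starts with a block name, then the rest of the line is consumed.
NOT_BLOCKNAME = re.compile(r"(?![a-zA-Z-]+::)[^\n]+")


def classify(line: str) -> str:
    """Report which of the two patterns matches the whole line."""
    if BLOCKNAME.fullmatch(line):
        return "blockname"
    if NOT_BLOCKNAME.fullmatch(line):
        return "other"
    return "neither"


if __name__ == "__main__":
    for sample in ["one-name::", "verbatim::", "Bla, bla", "abc:: extra"]:
        print(repr(sample), "->", classify(sample))
```

Note that this is not an exact complement: a line such as "abc:: extra" starts with a block name, so the lookahead rejects it even though the original regex does not match the whole line either. Anchoring the lookahead at the end of the line, as in (?![a-zA-Z-]+::$), would be one way to narrow that gap, if that is what is wanted.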
Context of use
My question comes from the code below, which doesn't work because the line rule must also reject anything that matches blockname.
TEXT = """
// Comment #1
abrev::
ok = ok
=========
One title
=========
Bla, bla
Bli, Bli
verbatim::
OK, or not ok ?
That is...
...the question.
Blu, Blu
==
10
==
// Comment #2
Blo, blo
==
05
==
"""
from lark import Lark
GRAMMAR = r"""
?start: _NL* (heading | comments | block)*
heading : ruler _NL title _NL ruler _NL+ (block | comments | paragraph)*
ruler : /={2,}/
title : /[^\n={2}\/{2}]+/
comments : "//" cline _NL*
paragraph : (line _NL)+
block : blockname _SINGLE_NL (tline _NL)+
blockname : /[a-zA-Z-]+::/
tline : " " /[^\n]+/
cline : /[^\n]+/
line : /[^\n={2}\/{2}]+/
_NL : /(\r?\n[\t ]*)+/
_SINGLE_NL : /([\t ]*\r?\n)/
"""
parser = Lark(GRAMMAR)
tree = parser.parse(TEXT)
print(tree.pretty())
I get the following incorrect output.
start
comments
cline Comment #1
block
blockname abrev::
tline ok = ok
heading
ruler =========
title One title
ruler =========
paragraph
line Bla, bla
line Bli, Bli
line verbatim:: <<< BAD
line OK, or not ok ? <<< BAD
line That is... <<< BAD
line ...the question. <<< BAD
line Blu, Blu <<< BAD
heading
ruler ==
title 10
ruler ==
comments
cline Comment #2
paragraph
line Blo, blo
heading
ruler ==
title 05
ruler ==
I would like to obtain something like:
start
comments
cline Comment #1
block
blockname abrev::
tline ok = ok
heading
ruler =========
title One title
ruler =========
paragraph
line Bla, bla
line Bli, Bli
block <<< GOOD
blockname verbatim:: <<< GOOD
tline OK, or not ok ? <<< GOOD
tline That is... <<< GOOD
tline ...the question. <<< GOOD
paragraph <<< GOOD
line Blu, Blu <<< GOOD
heading
ruler ==
title 10
ruler ==
comments
cline Comment #2
paragraph
line Blo, blo
heading
ruler ==
title 05
ruler ==