How to optimize a grammar for the Lark parser

Question

My goal is to create a parser that can parse DBC files. The syntax of the file format is given here.

To achieve this, I chose the Lark parser, based on the excellent JSON parser tutorial.

I created a grammar file based on the document linked above and started the standalone parser generator script. Unfortunately, it is still running after 30 minutes, and the process now consumes 7.5 GB of memory. This is of course not acceptable.

What can I do so that it successfully generates the parser code?

That's not supposed to happen. Are you sure you're using the `parser="lalr"` option? — Erez, Apr 29 '20 at 07:10
Sure, I reduced the problem to one rule only. I was just about to open an issue on github. — handras, Apr 29 '20 at 07:18

score 0 · Answer 1 · answered Apr 29 '20 at 10:29

It turned out that the issue is caused by too many optional terminals in a single rule. It can be fixed by splitting the rule into several smaller rules, each with fewer optional terminals.

For example:

start : [ "a" ["b"]  ["b1"]  ["b2"]  ["b3"] ["c"]  ["c1"]  ["c2"]  ["c3"] ["d"]  ["d1"]  ["d2"]  ["d3"] ["e"]  ["e1"]  ["e2"]  ["e3"] ["f"]  ["f1"]  ["f2"]  ["f3"]]

replaced by:

start : [ "a" b c d e f ]
b: ["b"] ["b1"] ["b2"] ["b3"]
c: ["c"] ["c1"] ["c2"] ["c3"]
d: ["d"] ["d1"] ["d2"] ["d3"]
e: ["e"] ["e1"] ["e2"] ["e3"]
f: ["f"] ["f1"] ["f2"] ["f3"]
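A rough way to see why splitting helps is sketched below. It assumes that the grammar compiler expands every optional item into two plain alternatives (item present, item absent), so a rule with n optionals turns into about 2^n alternatives. This is a simplified model written for illustration, not a measurement of Lark itself, and the helper names `count_expansions` and `expand` are hypothetical.

```python
from itertools import product


def count_expansions(num_optionals: int) -> int:
    """Alternatives produced if each optional is either present or absent.

    Assumption: optionals are expanded combinatorially (simplified model).
    """
    return 2 ** num_optionals


def expand(items):
    """Explicitly enumerate the alternatives of a small rule of optionals."""
    # Each item contributes two choices: a one-element tuple or an empty tuple.
    choices = [((item,), ()) for item in items]
    # Concatenate the chosen tuples of every combination into one alternative.
    return [sum(combo, ()) for combo in product(*choices)]


# The original rule has 20 optional terminals after "a".
single_rule = count_expansions(20)

# The rewritten grammar has 5 sub-rules with 4 optional terminals each.
split_rules = 5 * count_expansions(4)

# Enumerating one sub-rule (b) explicitly gives the same count as the formula.
b_alternatives = expand(["b", "b1", "b2", "b3"])

print(single_rule, split_rules, len(b_alternatives))
```

Under this model the single rule grows to roughly a million alternatives, while the five sub-rules stay at 16 each, which is consistent with the long run time and high memory use observed for the unsplit grammar.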

See the GitHub issue.
