
I want to parse a simple language which basically has a couple of special glyphs or characters in front of a line of text. If it doesn't have these, then the line of text is just taken as data.

For example:

+ hfflsdjf dslfhsldfh sdlfkh sdlfkhs 
! sdlfkhsdl sdfb sldflsdfh sldkfh sd
dsf sldfbbsf sdfjbs kfjbsd kjbsdf 

The first and second lines have special meanings because of the + and ! at the front; the rest of each line is data for that instruction. The third line is just data.

How could I express this in Instaparse?

Basically, I want to say that any string that isn't matched by any of the other rules should be matched by the DATA terminal.

interstar
  • What did you try? I am not sure I would approach this with instaparse. It seems that you can decide how to treat your entire data based on the first character alone. A switch statement may be more appropriate for this particular problem. – Shlomi Nov 16 '18 at 21:35
  • It's actually more complicated. I simplified for this question. But the point is that there is a default category that things that don't parse fall into. – interstar Nov 16 '18 at 21:38

1 Answer

(require '[instaparse.core :as insta])

;; plus and bang match lines that start with + or !; rubbish matches any line.
(def as-and-bs
  (insta/parser
    "<text> = (rubbish | op)*
     <op> = plus | bang
     <line> = #'[^\n]*(\n|$)'
     rubbish = line
     plus = '+' line
     bang = '!' line"))

(as-and-bs "+ abc\n! def\ncu ")
;=> ([:plus "+" " abc\n"] [:bang "!" " def\n"] [:rubbish "cu "])
akond
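
One thing to watch in a grammar like this: a catch-all rule that accepts any line can also accept a line that starts with + or !, so the grammar is ambiguous and the result depends on which parse Instaparse returns. A possible variation, offered only as a sketch, uses Instaparse's negative lookahead (`!`) so the default rule explicitly refuses lines that begin with an instruction character. The names `tagged-lines` and `data` are made up for this example, and the line regex is changed so that it always consumes at least one character.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical variation: `data` is only tried when the line does NOT
;; start with one of the instruction characters, so each line has
;; exactly one possible tag.
(def tagged-lines
  (insta/parser
    "<text> = (op | data)*
     <op> = plus | bang
     plus = '+' line
     bang = '!' line
     data = !('+' | '!') line
     <line> = #'[^\n]*\n|[^\n]+'"))

(tagged-lines "+ abc\n! def\ncu ")
```

With the sample input, the first two lines should come back tagged :plus and :bang and the last one :data. Adding a new instruction means adding its rule to `op` and its character to the lookahead in `data`.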