0

I want to write a PEG parser with PackCC (but also peg/leg or other libraries are possible) which is able to calculate some fields with variables on random position. The first simplified approach is the following grammar:

%source {                                                                      
  int vars[256];                                                                
}                                                            

statement <- e:term EOL      { printf("answer=%d\n", e); }                     

term <- l:primary 
      ( '+' r:primary        { l += r; }                                   
      / '-' r:primary        { l -= r; }
      )*                     { $$ = l; }                               
      / i:var '=' s:term     { $$ = vars[i] = s; }                             
      / e:primary            { $$ = e; }             

primary <- < [0-9]+ >        { $$ = atoi($1); }              
         / i:var !'='        { $$ = vars[i]; }                                    

var <- < [a-z] >             { $$ = $1[0]; }                                      

EOL    <- '\n' / ';'                                              

%%     

When testing with sequential order, it works fine:

a=42;a+1
answer=42
answer=43

But when having the variable definition behind the usage, it fails:

a=42;a+b;b=1
answer=42
answer=42
answer=1

And even deeper chained late definitions shall work, like:

a=42;a+b;b=c;c=1
answer=42
answer=42
answer=0
answer=1

Lets think about the input not as a sequential programming language, but more as a Excel-like spreadsheet e.g.:

A1: 42
A2: =A1+A3
A3: 1

Is it possible to parse and handle such kind of text with a PEG grammar?

Is two-pass or multi-pass an option here?

Or do I need to switch over to old style lex/yacc flex/bison?

rici
  • 234,347
  • 28
  • 237
  • 341
Achim
  • 442
  • 1
  • 3
  • 13

2 Answers2

1

I'm not familiar with PEG per se, but it looks like what you have is an attributed grammar where you perform the execution logic directly within the semantic action.

That won't work if you have use before definition.

You can use the same parser generator but you'll probably have to define some sort of abstract syntax tree to capture the semantics and postpone evaluation until you've parsed all input.

  • So this means going back to lex/yacc and splitting scanning from parsing. After scanning storing all tokens in a tree and parse the tree by checking each token for validity (whether it is already done) - buuh, sounds very compex – Achim Dec 04 '15 at 14:01
0

Yes, it is possible to parse this with a PEG grammar. PEG is effectively greedy LL(*) with infinite lookahead. Expressions like this are easy.

But the grammar you have written is left recursive, which is not PEG. Although some PEG parsers can handle left recursion, until you're an expert it's best to avoid it, and use only right recursion if needed.

david.pfx
  • 10,520
  • 3
  • 30
  • 63
  • Hi David, thanks for your answer! Meanwhile I've switched from PackCC (which allows left recursion) to peg/leg (which don't). So I was forced to change the grammar avoiding left recursion, like I edited in the original posting above. But still I found no way to lookahead, could you give any hint how to achieve this? I would prefer a peg solution much over building a token tree and jumping around for parsing the fields in the right order. Thanks Achim – Achim Dec 29 '15 at 21:59
  • PEG parsers all (?) provide some kind of predicate to test tokens without consuming them. For peg the construct is `&` see http://piumarta.com/software/peg/peg.1.html. But the proper way to use SO is to just ask another question... – david.pfx Dec 31 '15 at 04:02