
I would like to parse both f(arg).method and f(arg) as a block_statement; the former should take priority over the latter.

With the following rules in parser.mly, f(arg) cannot be parsed, but f(arg).method is parsed as follows:

  (* f(arg).method *)
  BS_MAE MAE_LE_UN (
    LE_IE IE_LE_AL (
      LE_SNE SNE_I f,
      AL_I arg),
    UN_I method)

(* parser.mly: *)

block_statement:
| member_access_expression { BS_MAE $1 }

simple_name_expression: | IDENTIFIER { SNE_I $1 }
member_access_expression: | l_expression DOT unrestricted_name { MAE_LE_UN ($1, $3) }
unrestricted_name: | IDENTIFIER { UN_I $1 }
index_expression: | l_expression LPAREN argument_list RPAREN { IE_LE_AL ($1, $3) }
expression: | l_expression { E_LE $1 }

l_expression:
| simple_name_expression { LE_SNE $1 } 
| index_expression { LE_IE $1 } 

call_statement: 
| simple_name_expression argument_list { CallS_SNE_AL ($1, $2) }
| member_access_expression argument_list { CallS_MAE_AL ($1, $2) }

argument_list: | IDENTIFIER { AL_I $1 }

If we add another production, | IDENTIFIER LPAREN expression RPAREN { BS_I_E ($1, $3) }, to block_statement, then f(arg) is parsed as follows:

  BS_I_E (
    f,
    E_LE LE_SNE SNE_I arg)

However, f(arg).method can no longer be parsed: the parser raises an error after reading the dot (.).

I don't know how to make the parser read a little further so that it takes f(arg).method as a whole when possible. I need the parser to accept both statements. Could anyone help?

SoftTimur

1 Answer


I would try a grammar with a structure along the lines of:

block:
| expr

expr:
| expr LPAREN argument_list RPAREN
| expr DOT unrestricted_name
| simple_expr

simple_expr:
| IDENTIFIER

Note that if you want to parse a full sentence, and not just a valid prefix of the input, your top-level rule should require the EOF token (to force the parser to go to the end of the input):

%start <block> main

main:
| b=block EOF { b }
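
To make the two points above concrete, here is a minimal hand-written sketch in plain OCaml rather than Menhir. It assumes a token type mirroring the grammar's terminals, a hypothetical AST (Name, Call, Member) that is not part of the original grammar, and an argument list that is a single identifier, as in the question. The left-recursive expr rule becomes a loop that keeps attaching call and member suffixes, and parse_main plays the role of the main rule by insisting on EOF.

```ocaml
(* Tokens a lexer would produce; names mirror the grammar's terminals. *)
type token = IDENTIFIER of string | LPAREN | RPAREN | DOT | EOF

(* Hypothetical AST, one constructor per production of [expr]. *)
type expr =
  | Name of string            (* simple_expr: IDENTIFIER *)
  | Call of expr * string     (* expr LPAREN argument_list RPAREN *)
  | Member of expr * string   (* expr DOT unrestricted_name *)

exception Parse_error of string

(* simple_expr: IDENTIFIER *)
let parse_simple = function
  | IDENTIFIER id :: rest -> (Name id, rest)
  | _ -> raise (Parse_error "identifier expected")

(* expr: the left recursion is unrolled into a loop that extends the
   expression parsed so far; it returns the unconsumed tokens as well. *)
let parse_expr tokens =
  let rec suffixes lhs = function
    | LPAREN :: IDENTIFIER arg :: RPAREN :: rest ->
        suffixes (Call (lhs, arg)) rest
    | DOT :: IDENTIFIER name :: rest ->
        suffixes (Member (lhs, name)) rest
    | rest -> (lhs, rest)
  in
  let lhs, rest = parse_simple tokens in
  suffixes lhs rest

(* main: b=block EOF. Accept only if nothing but EOF is left over. *)
let parse_main tokens =
  match parse_expr tokens with
  | e, [EOF] -> e
  | _ -> raise (Parse_error "unexpected input before end of file")

(* f(arg).method *)
let example =
  parse_main
    [IDENTIFIER "f"; LPAREN; IDENTIFIER "arg"; RPAREN;
     DOT; IDENTIFIER "method"; EOF]
```

Here example evaluates to Member (Call (Name "f", "arg"), "method"), and dropping the last two tokens before EOF gives Call (Name "f", "arg"), so both statements go through the same rule. Calling parse_expr alone would also succeed on a valid prefix and simply hand back the leftover tokens; the EOF check in parse_main is what turns trailing input into an error.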
gasche
  • Thank you :-)... Are you sure that `f(arg).method` could be well parsed? `.method` must be read as `DOT unrestricted_name`, thus `f(arg)` must be read as `member_access_expr`, but it is only the other branch of `call_expr` which contains parentheses... – SoftTimur Jan 24 '14 at 23:25
  • Indeed, I started with something parsing your example, made a change, and the resulting grammar was wrong. I updated it to conflate the two levels to allow parsing. Note that you have to have a top parsing rule that goes up to EOF (otherwise you may get a prefix of the input that is a valid parse), and that `f(x).y(z)` will get parsed as `(f(x).y)(z)`, as I suppose it should be. – gasche Jan 25 '14 at 12:08
  • Could you please elaborate a little more on `you have to have a top parsing rule that goes upto EOF (otherwise you may get a prefix of the input that is a valid parse)`? Besides, I don't see `EOF` in your grammar either... I guess that is the key to solving my problem: according to my grammar `f(arg).method` can't be parsed because `f(arg)` as a prefix is a valid parse... – SoftTimur Jan 26 '14 at 03:14
  • I edited my answer to add an example of EOF requirement. – gasche Jan 26 '14 at 12:36
  • Actually, the problem is that my whole grammar is complex (I have already removed lots of branches to make it representative in this post) and it is not `LR(1)`, so it is not easy to restructure... But I still have to modify it to increase the range of programs it deals with... I have found another way to treat the specific problem of this post by making subsets of identifiers, but there is another problem that I posted [here](http://stackoverflow.com/questions/21384853/try-the-first-rule-if-the-second-rule-fails)... – SoftTimur Jan 27 '14 at 16:39