I am writing a parser grammar in ANTLR (an LL grammar) and currently have the productions below. I want to parse one or more operands (numbers or strings) separated by "#", where "#" is right-associative. How do I modify the productions so that they parse one or more operands separated by "#", instead of just one as they do at the moment?

A ::= B
    | Number
    | String

B ::= C "#" A

C ::= Number
    | String

Example inputs that this grammar should accept:

ABC # 123
123 # ABC
ABC # DEF # 123
ABC # DEF # (123 # 456)
ABC # (DEF # 123) # 456

I tried the following EBNF form:

A ::= B
    | Number
    | String
    | "(" A ")"

B ::= C ("#" A)?

C ::= Number
    | String

But that makes my grammar ambiguous. How would I fix this ambiguity?

rlhh
  • `A::=(A)` is unlikely to be correct, even if you actually wrote `A::="("A")"`. The parenthesized expression is a primary (`C` in your grammar). – rici Mar 05 '16 at 06:08
  • How would I parse something like ABC # (DEF # 123) if my parenthesized expression is C? – rlhh Mar 05 '16 at 06:25

2 Answers


The ambiguity comes from the fact that you can derive Number or String in two ways: either directly (A -> Number) or through B (A -> B -> C -> Number), and similarly for String. The obvious fix is to get rid of the direct productions:

A ::= B
    | "(" A ")"

B ::= C ("#" A)?

C ::= Number
    | String
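
To see how this grammar behaves, here is a minimal recursive-descent sketch in Python rather than ANTLR. The names `tokenize`, `parse`, `parse_a`, `parse_b` and `parse_c`, and the tuple shape used for tree nodes, are illustrative assumptions and not part of the grammar above. Because B recurses into A on the right-hand side of "#", the tree nests to the right without any extra work. Note that in this grammar a parenthesized group can only be the last operand, so the sketch rejects `ABC # (DEF # 123) # 456`.

```python
import re


def tokenize(text):
    """Split the input into Number, String, '#', '(' and ')' tokens.

    Simplification (assumption): characters outside these classes are skipped.
    """
    return re.findall(r"\d+|[A-Za-z]+|[#()]", text)


def parse_a(tokens, i):
    # A ::= B | "(" A ")"
    if i < len(tokens) and tokens[i] == "(":
        node, i = parse_a(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("expected ')'")
        return node, i + 1
    return parse_b(tokens, i)


def parse_b(tokens, i):
    # B ::= C ("#" A)?
    left, i = parse_c(tokens, i)
    if i < len(tokens) and tokens[i] == "#":
        # Recursing into A for the right operand yields right-nested trees.
        right, i = parse_a(tokens, i + 1)
        return ("#", left, right), i
    return left, i


def parse_c(tokens, i):
    # C ::= Number | String (both kept as plain token strings here)
    if i >= len(tokens) or tokens[i] in {"#", "(", ")"}:
        raise SyntaxError("expected Number or String")
    return tokens[i], i + 1


def parse(text):
    """Parse a whole input string and require that every token is consumed."""
    tokens = tokenize(text)
    tree, i = parse_a(tokens, 0)
    if i != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[i]!r}")
    return tree
```

For instance, `parse("ABC # DEF # 123")` groups the input as `ABC # (DEF # 123)`.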
Chris Dodd

I think what you're looking for is quite a bit simpler:

A ::= B ( "#" B )*
B ::= Number | String | "(" A ")"

Not being an ANTLR pro, I'm not sure how you would go about marking # as right-associative, but the intent of the rule is to produce a list of Bs, so you could presumably associate them to the right in the semantic rule.

It's important to put the parenthesized expression rule at the bottom of the hierarchy (so to speak); otherwise, you wouldn't be able to parse ( first # second ) # third.
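
As a rough illustration of "collect a list of Bs, then associate them to the right", here is a small hand-written parser sketch in Python rather than ANTLR. The helper names `tokenize`, `parse`, `parse_a` and `parse_b`, and the tuple representation of tree nodes, are assumptions made for this example only. The loop in `parse_a` mirrors `B ( "#" B )*`, and the fold afterwards is where the right-associativity is imposed.

```python
import re


def tokenize(text):
    """Split the input into Number, String, '#', '(' and ')' tokens.

    Simplification (assumption): characters outside these classes are skipped.
    """
    return re.findall(r"\d+|[A-Za-z]+|[#()]", text)


def parse_a(tokens, i):
    # A ::= B ( "#" B )*
    node, i = parse_b(tokens, i)
    operands = [node]
    while i < len(tokens) and tokens[i] == "#":
        node, i = parse_b(tokens, i + 1)
        operands.append(node)
    # Fold the flat operand list from the right: a # b # c -> a # (b # c).
    tree = operands[-1]
    for left in reversed(operands[:-1]):
        tree = ("#", left, tree)
    return tree, i


def parse_b(tokens, i):
    # B ::= Number | String | "(" A ")"
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    token = tokens[i]
    if token == "(":
        node, i = parse_a(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("expected ')'")
        return node, i + 1
    if token in {"#", ")"}:
        raise SyntaxError(f"unexpected token {token!r}")
    return token, i + 1  # Number or String, kept as a plain token string


def parse(text):
    """Parse a whole input string and require that every token is consumed."""
    tokens = tokenize(text)
    tree, i = parse_a(tokens, 0)
    if i != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[i]!r}")
    return tree
```

With the parenthesized form at the bottom of the hierarchy, `parse("(first # second) # third")` succeeds and keeps the explicit grouping on the left, while an unparenthesized chain such as `ABC # DEF # 123` is grouped to the right by the fold.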

rici