I have a combined ANTLR grammar that is used to parse several lines of information. At the time of writing the grammar, not every kind of line may be known and defined yet. The parser should still recognize such unknown lines. The following is a simplified example:

rule:    (line)+ EOF;
LF:      ('\n'|'\r\n');
WS:      ' ';

INTEGER: ('0'..'9');
VALUE:   ('a'..'z'|'A'..'Z'|'0'..'9');

line:    'car' WS VALUE WS LF (subline LF)*;
subline: '>' (description | id | type | unknownsubline);

description: ('description' WS VALUE);
id:          ('id' WS INTEGER);
type:        ('type' WS VALUE);

unknownsubline:          (VALUE | WS | INTEGER)*;   /*don't known yet -> shall be logged...*/

I feed it the following input:

car car1
>description redPorsche
>id 123
>type pkw
>tires 4
>specifica fast,car
car car2
>description blueTruck

The line `>tires 4` is recognized successfully by the ANTLR interpreter in Eclipse. But the next line, `>specifica fast,car`, causes a NoViableAltException because the word `car` is an already defined token that is used here in an unknown context.

Is there any way to avoid this behaviour? Is it possible to recognise a VALUE that contains already defined tokens?

  • PEG parsers like [parboiled](https://github.com/sirthias/parboiled/wiki) solve this kind of problem naturally. If you're not bound to Java, you can also take a look at [Grako](https://github.com/sirthias/parboiled/wiki). – Apalala Mar 16 '13 at 15:59
  • Sorry @Apalala, I am bound to Java this time. But thanks for the hint. – user2124486 Mar 18 '13 at 12:03
  • *parboiled* **is** Java (or Scala)... – Apalala Mar 18 '13 at 16:45

1 Answer

Don't make 'car' a keyword. Match it as an ordinary VALUE and check its text with a semantic predicate instead:

line : car WS VALUE WS LF (subline LF)*;

car : id=VALUE {$id.text.equals("car")}? ();

Note that your definition of VALUE seems to be missing a + at the end.
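
To make the idea concrete outside of ANTLR, here is a minimal plain-Java sketch of the same approach: every word is treated as a generic value, and a word only counts as a keyword when its text is checked at the position where the grammar expects one. The class and method names (`ContextualKeywordDemo`, `classify`, `isValue`, `isInteger`) are made up for illustration and are not part of ANTLR or of the generated parser. It also assumes that VALUE and INTEGER match one or more characters.

```java
// Hypothetical illustration only; this is not ANTLR-generated code.
class ContextualKeywordDemo {

    // Assumption: VALUE is one or more letters or digits (the grammar plus the missing '+').
    static boolean isValue(String word) {
        return word.matches("[a-zA-Z0-9]+");
    }

    // Assumption: INTEGER is one or more digits.
    static boolean isInteger(String word) {
        return word.matches("[0-9]+");
    }

    // Classifies one input line. No word is reserved up front: "car",
    // "description", "id" and "type" are compared by text only at the
    // position where the grammar expects them.
    static String classify(String line) {
        if (line.startsWith(">")) {
            String body = line.substring(1);
            String[] parts = body.split(" ", 2);
            String key = parts[0];
            String arg = parts.length > 1 ? parts[1] : "";
            if (key.equals("description") && isValue(arg)) return "DESCRIPTION " + arg;
            if (key.equals("id") && isInteger(arg)) return "ID " + arg;
            if (key.equals("type") && isValue(arg)) return "TYPE " + arg;
            // Anything else is an unknown subline that can be logged.
            return "UNKNOWN " + body;
        }
        String[] parts = line.split(" ", 2);
        if (parts.length == 2 && parts[0].equals("car") && isValue(parts[1])) {
            return "CAR " + parts[1];
        }
        return "ERROR " + line;
    }

    public static void main(String[] args) {
        String[] input = {
            "car car1", ">description redPorsche", ">id 123", ">type pkw",
            ">tires 4", ">specifica fast,car", "car car2", ">description blueTruck"
        };
        for (String line : input) {
            System.out.println(classify(line));
        }
    }
}
```

Because `car` is never reserved, a line such as `>specifica fast,car` simply falls through to the unknown case instead of failing on an unexpected keyword.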

Apalala
  • I also had an error because I used `id` as a label; `id` is a rule in my grammar. Should it be `car: any=VALUE {$any.text == "car"=>()}`? – user2124486 Mar 19 '13 at 08:11
  • Note that the semantics of `{...}?` are different from those of a gated predicate, `{...}?=>`. If I remember correctly, `{...}?` will produce unrecoverable errors because the parser cannot know how to recover from a failed predicate. – Apalala Mar 19 '13 at 14:25