
Given the following simple grammar, I'd like to parse both strings and numbers:

grammar Simple;

aRule : 'fs' '(' value["textual"] ')' ;
bRule : 'fi' '(' value["numeral"] ')' ;
cRule : 'f' '(' (value["textual"] | value["numeral"]) ')' ;    

value[String k]
  : {$k.equals("any") || $k.equals("textual")}? string 
  | {$k.equals("any") || $k.equals("numeral")}? numeric 
  ;

string
  : STRING_LITERAL
  ;

numeric
  : ('+' | '-')? INTEGER_LITERAL
  ;

STRING_LITERAL
  : '\'' (~('\'' | '\r' | '\n') | '\'' '\'' | NEWLINE)* '\''
  ;

INTEGER_LITERAL
  : '0' | [1-9] [0-9]* 
  ;

SPACES
  : [ \t\r\n]+ -> skip
  ;

fragment NEWLINE                   : '\r'? '\n';

Now, I'd like to parse the following expressions:

fs('asdf') // works
fi(512)    // works
f('asdf')  // works
f(512)     // fails

If I switch textual and numeral in cRule, then f('asdf') fails and f(512) works.

Any ideas?

UPDATE1

grammar Simple;

rules : aRule | bRule | cRule ;
aRule : 'fs' '(' value["textual"] ')' ;
bRule : 'fi' '(' value["numeral"] ')' ;
cRule : 'f' '(' (tRule | nRule) ')' ;
tRule : value["textual"] ;
nRule : value["numeral"] ;
value[String k]
  : {$k.equals("any") || $k.equals("textual")}? string 
  | {$k.equals("any") || $k.equals("numeral")}? numeric 
  ;
string  : STRING_LITERAL ;
numeric : ('+' | '-')? INTEGER_LITERAL ;

STRING_LITERAL  : '\'' (~('\'' | '\r' | '\n') | '\'' '\'' | NEWLINE)* '\'' ;
INTEGER_LITERAL : '0' | [1-9] [0-9]* ;
SPACES          : [ \t\r\n]+ -> skip ;

fragment NEWLINE : '\r'? '\n';

Even with this updated grammar (as suggested by @GRosenberg), f(512) still fails with: no viable alternative at input '512'. As before, fs('asdf'), fi(512) and f('asdf') work.

nemron
  • *"that does not work"* => this is meaningless; please tell us *what* does not work and *how*, see [mcve]. Also, *why* are you convoluting your grammar like that? Why not simply do `either: string | numeric;` and `aRule: A_STR_FUNC '(' string ')';` and so on? – Lucas Trzesniewski Sep 15 '16 at 15:36
  • This is just a simple excerpt of a much more complicated grammar. In the full grammar I have to restrict expressions at some places to be only of "type" string, int, date time, etc. So, to ensure that not just /any/ expression is allowed at those places, I wanted to follow this approach. – nemron Sep 15 '16 at 15:42
  • Looks like [context-dependent predicates don't play too well with prediction](https://github.com/antlr/antlr4/blob/master/doc/predicates.md#using-context-dependent-predicates). I *still* think what you're trying to do is achievable without predicates though. – Lucas Trzesniewski Sep 15 '16 at 16:27
  • Thanks for the link! And yes, it certainly is possible, but would be much more verbose. – nemron Sep 15 '16 at 16:29
  • A different alternative is to accept any data type at the grammar level, and then perform a validation pass afterward (with a visitor for instance). After all, passing an unexpected parameter type is a semantic error, not a syntactic one. This simplifies the grammar a lot, as you don't have to track types in it, just accept everything that's syntactically well-formed and deal with the issues later in code. – Lucas Trzesniewski Sep 15 '16 at 16:32
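The validation pass suggested in the last comment could look roughly like the sketch below. It is only an illustration and deliberately avoids the ANTLR runtime: the `Call` class is a hypothetical stand-in for whatever a visitor over the parse tree would collect, and `ArgTypeCheck`, `Kind`, `ALLOWED` and `validate` are made-up names, not part of ANTLR or of the grammar above. It assumes the grammar has been relaxed so that every function accepts either a string or a numeric argument.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ArgTypeCheck {
    /** Kinds of literal the relaxed grammar would accept as an argument. */
    public enum Kind { TEXTUAL, NUMERAL }

    /** Hypothetical stand-in for one parsed call, e.g. fi(512) -> ("fi", NUMERAL). */
    public static final class Call {
        final String function;
        final Kind argument;

        public Call(String function, Kind argument) {
            this.function = function;
            this.argument = argument;
        }
    }

    /** Argument kinds each function allows; mirrors aRule, bRule and cRule. */
    static final Map<String, Set<Kind>> ALLOWED = Map.of(
            "fs", Set.of(Kind.TEXTUAL),
            "fi", Set.of(Kind.NUMERAL),
            "f", Set.of(Kind.TEXTUAL, Kind.NUMERAL));

    /** Returns one message per call whose argument kind is not allowed. */
    public static List<String> validate(List<Call> calls) {
        List<String> errors = new ArrayList<>();
        for (Call c : calls) {
            Set<Kind> allowed = ALLOWED.get(c.function);
            if (allowed == null) {
                errors.add("unknown function: " + c.function);
            } else if (!allowed.contains(c.argument)) {
                errors.add(c.function + " does not accept a " + c.argument + " argument");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<Call> calls = List.of(
                new Call("fs", Kind.TEXTUAL),   // fs('asdf')
                new Call("f", Kind.NUMERAL),    // f(512)
                new Call("fi", Kind.TEXTUAL));  // fi('asdf'), a semantic error
        validate(calls).forEach(System.out::println);
    }
}
```

With this split, the parser only decides whether the input is well-formed, and the type restriction lives in one table that is easy to extend with further kinds such as date time.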

2 Answers


I agree with Lucas, this is a totally overcomplicated grammar. If you want to accept only a string then, by all means, specify only string in your grammar. Why use a value rule with all the different options and then limit it to a single one? That's a classic way to shoot yourself in the foot. Instead, do it like this:

rules : aRule | bRule | cRule ;
aRule : 'fs' '(' string ')' ;
bRule : 'fi' '(' numeric ')' ;
cRule : 'f' '(' (tRule | nRule) ')' ;
tRule : string ;
nRule : numeric ;

It's also much easier to read if you spell out what you want your language to look like, instead of trying to parameterize some generic rule.

Mike Lischke

The relevant generated code is listed below. In it, notice that a predicate constraint failure results in an exception. This explains why the second alt is always skipped -- either the first alt is correct or the whole rule fails.

Whether the generated code is somehow 'improper' is a question probably best directed to the ANTLR4 Google group or the GitHub issues page.

In any case, the solution is to separate using subrules:

cRule   : F LPAREN ( dRule | eRule ) RPAREN EOF ;
dRule   : value["textual"] ;
eRule   : value["numeral"] ;

UPDATE

The value rule is the one that needs to be split into subrules:

cRule   : F LPAREN ( valuet["textual"] | valuen["numeral"] ) RPAREN EOF ;

valuet[String k]
    : {$k.equals("any") || $k.equals("textual")}? string
    ;

valuen[String k]
    : {$k.equals("any") || $k.equals("numeral")}? numeric
    ;

Tested/works.


public final ValueContext value(String k) throws RecognitionException {
    ValueContext _localctx = new ValueContext(_ctx, getState(), k);
    enterRule(_localctx, 2, RULE_value);
    try {
        setState(21);
        _errHandler.sync(this);
        switch (getInterpreter().adaptivePredict(_input, 1, _ctx)) {
            case 1:
                enterOuterAlt(_localctx, 1); {
                setState(17);
                if (!(_localctx.k.equals("any") || _localctx.k.equals("textual")))
                    throw new FailedPredicateException(this, "$k.equals(\"any\") || $k.equals(\"textual\")");
                setState(18);
                string();
            }
                break;
            case 2:
                enterOuterAlt(_localctx, 2); {
                setState(19);
                if (!(_localctx.k.equals("any") || _localctx.k.equals("numeral")))
                    throw new FailedPredicateException(this, "$k.equals(\"any\") || $k.equals(\"numeral\")");
                setState(20);
                numeric();
            }
                break;
        }
    } catch (RecognitionException re) {
        _localctx.exception = re;
        _errHandler.reportError(this, re);
        _errHandler.recover(this, re);
    } finally {
        exitRule();
    }
    return _localctx;
}
GRosenberg
  • When I tried to repro this (in C#), the error I was getting didn't come from there, but from `adaptivePredict`, and it was a `NoViableAltException`, so it's *not* a predicate failure. – Lucas Trzesniewski Sep 15 '16 at 18:05
  • I tried your suggestion, but unfortunately it does not work. See the updated grammar. – nemron Sep 15 '16 at 18:54
  • Interesting. OP did not specify, so likely using the default Java generator. It produces predicate *constraints* on the alts, as shown, that are separate from and in addition to the actual predicate evaluation. May be the C# implementation builds this into its `adaptivePredict`. – GRosenberg Sep 15 '16 at 18:55
  • What's weird is that `cRule : 'f' '(' (tRule) ')' | 'f' '(' (nRule) ')' ;` also does not accept both `f('asdf')` and `f(512)`. – nemron Sep 15 '16 at 19:06
  • Updated & tested. – GRosenberg Sep 15 '16 at 19:16
  • Yes, but the goal of the entire exercise was to have just *one* `value` rule. Otherwise the parameter and the semantic predicate are superfluous; one should then simply use `string` and `numeric` directly. – nemron Sep 16 '16 at 06:03
  • No intent to argue, but hard to see the practical value in having just *one* `value` rule. Perhaps, after explaining why that is of critical importance, an alternate approach might be offered as a better solution. If just an exercise, then the best conclusion is to use multiple rules. – GRosenberg Sep 16 '16 at 16:33