In regular expressions, I can write:

a(.)*b

and this will match the entire string in, for example,

acdabb

I am trying to simulate this with a token stream in Happy:

t : a wildcard b
wildcard : {- empty -} | wild wildcard
wild : a | b | c | d | whatever

However, the parser generated by Happy does not recognize

acdabb

Is there a way around this, or am I doing it wrong?

Jonathan Gallagher
  • The shift/reduce conflict can be eliminated by converting the above into a left-recursive form (wildcard : | wildcard wild). I didn't think that left or right recursion mattered for LALR parsers. However, I am still curious if I am missing something. – Jonathan Gallagher Mar 17 '14 at 22:22
  • The reason is the bound on the number of lookaheads that Happy uses. The right recursive version is LALR(1). – Jonathan Gallagher Mar 17 '14 at 22:29
  • I meant the left recursive version is LALR(1), the right recursive version is not; sorry about that. – Jonathan Gallagher Mar 18 '14 at 02:32

1 Answer

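As you noted, Happy generates an LALR(1) parser, as its documentation states. You also noted in the comments that changing to left recursion resolves the problem, but for the novice it might not be clear how that is achieved. With the right-recursive rule, the parser must decide on seeing a b whether to reduce the empty wildcard or to shift the b as another wild, and one token of lookahead is not enough to tell. To change the recursion, wild wildcard is rewritten as wildcard wild, which results in the following file: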

{
module ABCParser (parse) where
}

%tokentype { Char }

%token a { 'a' }
%token b { 'b' }
%token c { 'c' }
%token d { 'd' }
%token whatever { '\n' }

%name parse t

%%

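t
  : a wildcard b
    { }

-- Left-recursive: the empty wildcard is reduced first, then each wild is appended.
wildcard
  : {- empty -}
    { }
  | wildcard wild
    { }

wild
  : a
    { }
  | b
    { }
  | c
    { }
  | d
    { }
  | whatever
    { }

{
-- Happy calls happyError on a parse failure; the generated module does not compile without it.
happyError :: [Char] -> a
happyError _ = error "parse error"
}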

With this change, Happy generates a working parser.
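To see why one token of lookahead is enough for the left-recursive grammar, here is a minimal hand-written sketch in plain Haskell of the decision the parser makes. It is not the code Happy generates, and the names isWild and recognise are hypothetical, chosen only for this illustration. After the leading a, every token is absorbed into wildcard unless it is a b with nothing after it, in which case the whole input is reduced to t.

```haskell
-- Tokens accepted by the `wild` rule of the grammar (hypothetical helper).
isWild :: Char -> Bool
isWild c = c `elem` "abcd\n"

-- Recognise  t : a wildcard b  with a single token of lookahead.
recognise :: String -> Bool
recognise ('a' : rest) = go rest
  where
    -- A 'b' followed by end of input closes the match (reduce to t).
    go "b" = True
    -- Any other wild token, including an inner 'b', extends wildcard.
    go (c : cs) | isWild c = go cs
    -- Empty input or an unknown token: no parse.
    go _ = False
recognise _ = False
```

Loaded into GHCi, recognise "acdabb" evaluates to True, while recognise "acd" evaluates to False because the input does not end in b.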

Brian Tompsett - 汤莱恩