
I am working on a Happy parser for a language with the following types, and many more.

type :: { ... }
type :
    'void'                { ... }
  | type '*'              { ... } {- pointer    -}
  | type '(' types ')'    { ... } {- function   -}
  | ...                           {- many more! -}

types :: { ... }
types :
    {- empty -}           { ... }
  | types ',' type        { ... }

The language has apparently ambiguous syntax for calls.

callable :: { ... }
callable :
    type                   operand    { ... }    {- return type -}
  | type '(' types ')' '*' operand    { ... }    {- return and argument types -}

The second rule does not have the same meaning as the first when type takes on the type of a function pointer.

The ambiguity can be removed by adding a special rule for a type that isn't a function pointer. I would like to avoid doing so, because it means duplicating all of the type definitions to produce something like

callable :: { ... }
callable :
    typeThatIsNotAFunctionPointer operand    { ... }
  | type '(' types ')' '*'        operand    { ... }

How can I specify that the alternative `type operand` is only legal when the `type '(' types ')' '*' operand` alternative fails?

There are many questions on Stack Overflow about why a grammar has ambiguities (I found at least 7), and some about how to remove an ambiguity, but none about how to specify the way an ambiguity should be resolved.

Undesirable Solution

I'm aware that I can refactor the grammar for types to a giant convoluted mess.

neverConstrainedType :: { ... }
neverConstrainedType :
    'int'                 { ... }
  | ...                   {- many more! -}

voidType :: { ... }
voidType :
    'void'                { ... }

pointerType :: { ... }
pointerType :
    type '*'              { ... } {- pointer    -}

functionType :: { ... }
functionType :
    type '(' types ')'    { ... } {- function   -}

type :: { ... }
type :
    neverConstrainedType  { ... }
  | voidType              { ... }
  | pointerType           { ... }
  | functionType          { ... }

typeNonVoid :: { ... }    {- this already exists -}
typeNonVoid : 
    neverConstrainedType  { ... }
  | pointerType           { ... }
  | functionType          { ... }

typeNonPointer :: { ... }
typeNonPointer :
    neverConstrainedType  { ... }
  | voidType              { ... }
  | functionType          { ... }

typeNonFunction :: { ... }
typeNonFunction :
    neverConstrainedType  { ... }
  | voidType              { ... }
  | pointerType           { ... }

typeNonFunctionPointer :: { ... }
typeNonFunctionPointer :
    typeNonPointer        { ... }
  | typeNonFunction '*'   { ... }

And then define callable as

callable :: { ... }
callable :
    typeNonFunctionPointer                    operand    { ... }
  | type                   '(' types ')' '*'  operand    { ... }
Cirdec
  • what can operand look like? – ErikR Jan 14 '15 at 19:19
  • also, this is not material to your question, but I'm curious about your definition of types. A list of types appears to begin with a comma - is that right? – ErikR Jan 14 '15 at 19:22
  • @user5402 Neither of those details matters. It should be sufficient to know that `types` is always exactly the same everywhere it appears as is `operand`. In reality `types` is defined so that it can't begin with a `,`. It's essentially `types' : type | types' ',' type` and `types : {- empty -} | types'`. – Cirdec Jan 14 '15 at 19:40

1 Answer


Basically you have what's called a shift/reduce conflict. You can google "resolve shift/reduce conflict" for more info and resources.

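The basic idea in resolving shift/reduce conflicts is to refactor the grammar. For instance, this grammar is unambiguous but still produces a shift/reduce conflict: after reading id with comma as the next token, the parser cannot tell whether to reduce B : id or to shift the comma.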

%token id comma int
A : B comma int
B : id     
  | id comma B

The shift/reduce conflict can be eliminated by refactoring it as:

A : B int
B : id comma
  | id comma B

In your case you could try something like this:

type : simple               {0}
     | func                 {0}
     | funcptr              {0}

simple : 'void'             {0}
       | simple '*'         {0}
       | funcptr '*'        {0}

func : type '(' type ')'    {0}

funcptr : func '*'          {0}

The idea is this:

  • simple matches any type that is not a function or function pointer
  • func matches any function type
  • funcptr matches any function pointer type
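With that split, the two callable alternatives from the question could be written roughly as below. This is an untested sketch: it reuses the `{0}` placeholder actions from above, and whether Happy reports any remaining conflicts depends on what `operand` can start with, which the question does not show.

```
callable : simple  operand    {0}  {- return type only: not a function pointer -}
         | func    operand    {0}  {- return type only: a plain function type  -}
         | funcptr operand    {0}  {- return and argument types                -}
```

Because `funcptr` is `func '*'` and `func` is `type '(' type ')'`, the third alternative covers the `type '(' types ')' '*' operand` form, and the first two cover every type that is not a function pointer.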

That said, I've found that many of the things I've attempted to do in grammars are better accomplished by analyzing the parse tree after it's been created.
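As a minimal sketch of that post-processing approach: the grammar keeps only the single rule `callable : type operand`, and a Haskell function decides afterwards which meaning applies. All of the names here (`Type`, `RawCallable`, `Callable`, `resolveCallable`) are hypothetical, and representing an operand as a `String` is an assumption made only to keep the example self-contained.

```haskell
-- Hypothetical AST for types; constructor names are illustrative only.
data Type
  = Void
  | Pointer Type
  | Function Type [Type]   -- return type, argument types
  deriving (Eq, Show)

-- What the parser would build from the single rule: callable : type operand
data RawCallable = RawCallable Type String
  deriving (Eq, Show)

-- The two meanings a callable can have once the ambiguity is resolved.
data Callable
  = ReturnOnly Type String             -- type operand
  | ReturnAndArgs Type [Type] String   -- type '(' types ')' '*' operand
  deriving (Eq, Show)

-- Prefer the function-pointer reading whenever the parsed type has that
-- shape; every other type falls back to the return-type-only reading.
resolveCallable :: RawCallable -> Callable
resolveCallable (RawCallable (Pointer (Function ret args)) op) =
  ReturnAndArgs ret args op
resolveCallable (RawCallable ty op) =
  ReturnOnly ty op
```

The priority between the two readings is then expressed by the order of the pattern matches rather than by the grammar, so the type rules do not need to be duplicated.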

ErikR