
When implementing real world (TM) languages, I often encounter a situation like this:

(* language Foo *)
type a = ... (* parsed by parseA *)
type b = ... (* parsed by parseB *)
type collection = { as_ : a list ; bs : b list } (* "as" is an OCaml keyword, hence as_ *)

(* parser ParseFoo.mly *)
parseA : ... { A ( ... ) }
parseB : ... { B ( ... ) }

parseCollectionElement : parseA { .. } | parseB { .. }

parseCollection : nonempty_list (parseCollectionElement) { ... }

In a functional style, the obvious approach would be to pass the partially parsed collection record to each invocation of the semantic actions of parseA and parseB, and have them update the matching list field.

Is that even possible with menhir, or does one have to resort to the ugly hack of a mutable global variable?

choeger

1 Answer


Well, you're quite limited in what you're allowed to do in menhir/ocamlyacc semantic actions. If you find this really frustrating, you can try Parsec-like parser combinator libraries, e.g. mparser, which let you use OCaml in your rules to the full extent.

My personal approach to this kind of problem is to keep the parser at the most primitive level, without trying to define anything sophisticated in it, and to lift the parser output to a higher-level representation later.
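As a rough sketch of that "lift it later" idea: let the parser return a flat list of tagged elements, then fold it into the record in ordinary OCaml. Everything here is an assumption made for illustration: the payloads of `a` and `b` are stand-ins for the types elided in the question, `element` and `collect` are hypothetical names, and the field is spelled `as_` because `as` is a reserved word in OCaml.

```ocaml
(* Hypothetical stand-ins for the question's elided types. *)
type a = A of int
type b = B of string

(* What a "primitive" parser could return: a flat list of tagged elements. *)
type element = ElemA of a | ElemB of b

type collection = { as_ : a list; bs : b list }

(* Lift the flat parser output into the record in a separate pass.
   fold_right walks the list from the end, so consing onto the
   accumulator keeps each field in source order. *)
let collect (elements : element list) : collection =
  List.fold_right
    (fun element acc ->
       match element with
       | ElemA x -> { acc with as_ = x :: acc.as_ }
       | ElemB y -> { acc with bs = y :: acc.bs })
    elements
    { as_ = []; bs = [] }

(* Usage: the a's and b's end up in separate lists. *)
let _example = collect [ ElemA (A 1); ElemB (B "x"); ElemA (A 2) ]
```

With this split, the grammar can keep using nonempty_list unchanged, and no mutable state is involved.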

But your case looks simple enough to me. Here, instead of using a parameterized menhir rule, you can just write the list rule by hand and produce the collection in its semantic action. nonempty_list is syntactic sugar that, like any other sugar, works in most cases but is less flexible than a hand-written rule.
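A hand-written rule along those lines might look roughly like the fragment below. It is only a sketch: the token declarations, %start/%type annotations and the bodies of parseA and parseB are left out just as in the question, `add` is a hypothetical helper, and the record field is assumed to be spelled `as_` since `as` is an OCaml keyword.

```ocaml
%{
  (* Assumes the collection record type from the question is in scope. *)
  let add element acc =
    match element with
    | `A a -> { acc with as_ = a :: acc.as_ }
    | `B b -> { acc with bs = b :: acc.bs }
%}

%%

parseCollectionElement:
  | a = parseA { `A a }
  | b = parseB { `B b }

(* Hand-written, right-recursive replacement for
   nonempty_list(parseCollectionElement). The record is built on reduce,
   starting from the last element, so both lists stay in source order. *)
parseCollection:
  | e = parseCollectionElement
      { add e { as_ = []; bs = [] } }
  | e = parseCollectionElement rest = parseCollection
      { add e rest }
```

The partial record is threaded through the return values of the recursive rule rather than passed down into parseA and parseB, so no global mutable variable is needed.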

ivg
  • That sounds reasonable. Instead of passing the partially parsed collection forward on shift, I can obviously also update it on reduce in a custom recursive rule... I'll look into it and let you know if it works out. – choeger Jan 05 '15 at 12:40