I'm attempting to parse permutations of flags. The behavior I want is "one or more flags in any order, without repetition". I'm using the following packages:

  • megaparsec
  • parser-combinators

The code I have produces the output I want for valid input, but it is too lenient: it accepts input in which the same flag is repeated, and I don't understand why. What am I doing wrong here?

pFlags :: Parser [Flag]
pFlags = runPermutation $ f <$> 
    toPermutation (optional (GroupFlag <$ char '\'')) <*> 
    toPermutation (optional (LeftJustifyFlag <$ char '-'))
    where f a b = catMaybes [a, b]

Examples:

"'-" = [GroupFlag, LeftJustifyFlag] -- CORRECT
"-'" = [LeftJustifyFlag, GroupFlag] -- CORRECT
"''''-" = [GroupFlag, LeftJustifyFlag] -- INCORRECT, should fail if there's more than one of the same flag.
danielbeard
  • I suspect `toPermutation` doesn't play well with `optional`. Perhaps you could eschew permutation parsing, simply parse a list of `some` `Flag`s, and just check that they're all different after the fact. – danidiaz Dec 18 '21 at 14:44
  • @danidiaz I wanted to avoid having to do that, but yeah I can try that. – danielbeard Dec 18 '21 at 15:07
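
The approach suggested in the comment above (parse `some` flags, then check that they are all different) could look roughly like the following. This is a minimal sketch, not code from the question: the `Flag` and `Parser` definitions are assumed, and `pFlag` and `pFlagsDistinct` are hypothetical names introduced for illustration.

```haskell
import Control.Monad (when)
import Data.List (nub)
import Data.Void (Void)
import Text.Megaparsec (Parsec, choice, parseMaybe, some)
import Text.Megaparsec.Char (char)

-- Assumed definitions; the question does not show them.
data Flag = GroupFlag | LeftJustifyFlag
  deriving (Eq, Show)

type Parser = Parsec Void String

-- Parse a single flag character (hypothetical helper).
pFlag :: Parser Flag
pFlag = choice
  [ GroupFlag <$ char '\''
  , LeftJustifyFlag <$ char '-'
  ]

-- Parse one or more flags in input order, then reject the
-- result if any flag occurs more than once.
pFlagsDistinct :: Parser [Flag]
pFlagsDistinct = do
  flags <- some pFlag
  when (nub flags /= flags) $ fail "duplicate flag"
  pure flags
```

Unlike the permutation parser, this keeps the flags in the order they were written, at the cost of doing the uniqueness check by hand. Running it with `parseMaybe pFlagsDistinct` also requires the whole input to be consumed.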

1 Answer

Instead of toPermutation with optional, I believe you need to use toPermutationWithDefault, something like this (untested):

toPermutationWithDefault Nothing (Just GroupFlag <$ char '\'')

The reasoning is described in the paper “Parsing Permutation Phrases” (PDF) in §4, “adding optional elements” (emph. added):

Consider, for example […] all permutations of a, b and c. Suppose b can be empty and we want to recognise ac. This can be done in three different ways since the empty b can be recognised before a, after a or after c. Fortunately, it is irrelevant for the result of a parse where exactly the empty b is derived, since order is not important. This allows us to use a strategy similar to the one proposed by Cameron: parse nonempty constituents as they are seen and allow the parser to stop if all remaining elements are optional. When the parser stops the default values are returned for all optional elements that have not been recognised.

To implement this strategy we need to be able to determine whether a parser can derive the empty string and split it into its default value and its non-empty part, i.e. a parser that behaves the same except that it does not recognise the empty string.

That is, the permutation parser needs to know which elements can succeed without consuming input, otherwise it will be too eager to commit to a branch. I don’t know why this would lead to accepting multiples of an element, though; perhaps you’re also missing an eof?
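
Putting both suggestions together (the `toPermutationWithDefault` form plus a check that all input is consumed) might look like the sketch below. Like the snippet above, it should be read as an illustration rather than a verified fix: the `Flag` and `Parser` definitions are assumed, since the question does not show them.

```haskell
import Control.Applicative.Permutations
  (runPermutation, toPermutationWithDefault)
import Data.Maybe (catMaybes)
import Data.Void (Void)
import Text.Megaparsec (Parsec, parseMaybe)
import Text.Megaparsec.Char (char)

-- Assumed definitions; the question does not show them.
data Flag = GroupFlag | LeftJustifyFlag
  deriving (Eq, Show)

type Parser = Parsec Void String

-- Each flag is an optional element of the permutation: it
-- defaults to Nothing and yields Just the flag when present.
pFlags :: Parser [Flag]
pFlags = runPermutation $ f
    <$> toPermutationWithDefault Nothing (Just GroupFlag <$ char '\'')
    <*> toPermutationWithDefault Nothing (Just LeftJustifyFlag <$ char '-')
  where
    -- The result is assembled in this fixed order,
    -- regardless of the order in the input.
    f a b = catMaybes [a, b]
```

`parseMaybe pFlags` succeeds only if the parser consumes the entire input, so it plays the role of the `eof` check: a leftover repeated flag such as the extra quotes in `"''''-"` should then cause the parse to be rejected instead of being silently ignored. Because `f` always lists `GroupFlag` first, the result does not reflect the order in which the flags were written. This version also accepts empty input, so the "one or more" requirement would need a separate check.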

Jon Purdy
  • That approach compiles, but the behavior appears to be the same as the `optional` approach. – danielbeard Dec 19 '21 at 17:43
  • Actually, there is a different behavior; I just didn't notice it initially because the order isn't preserved by default (something I can look at later). Thanks! – danielbeard Dec 19 '21 at 18:02