Where does many produce the empty string?

Question

I am currently reading Programming in Haskell by Graham Hutton, and I am stuck on the chapter on parsers. It defines two mutually recursive functions:

many  p = many1 p +++ return []
many1 p = do v  <- p
             vs <- many p   
             return (v:vs)

Here many1 is desugared into this form:

many1 p = p >>= (\ v -> many p >>= (\ vs -> return (v : vs)))

The >>= operator is defined as:

p >>= f =  P (\inp -> case parse p inp of
                           []        -> []
                           [(v,out)] -> parse (f v) out)

The +++ operator is defined as:

p +++ q =  P (\inp -> case parse p inp of
                           []        -> parse q inp
                           [(v,out)] -> [(v,out)])

The other functions relevant to this question are these:

parse             :: Parser a -> String -> [(a,String)]
parse (P p) inp   =  p inp

sat p =  do x <- item
            if p x then return x else failure

digit =  sat isDigit

failure = P (\inp -> [])
item    = P (\inp -> case inp of
                          []     -> []
                          (x:xs) -> [(x,xs)])
return v = P (\inp -> [(v,inp)])

Now, when attempting to use many to parse digits from the string "a", like:

parse (many digit) "a"

the result is [("","a")].

When attempting to parse digits from the string "a" using many1 like:

parse (many1 digit) "a"

the result is [].

I think I understand the second result. (many1 digit) attempts to parse the string "a", so it runs digit on "a", which fails since 'a' is not a digit, and so the empty list [] is returned.

However, I do not understand the first result when using (many digit). If (many1 digit) returns [] then obviously it failed, and so in the +++ operator the [] branch of the case expression runs and the second parser, return [], is applied to the input. But when I try parse (return []) "a" the result I get back is [([], "a")].

I don't get why the result of many is [("", "a")] when the result of many1 is []. Any help is appreciated.

P.S. I have seen this question already, but it doesn't give me the answer I am looking for.

When you talk about "manny" and "manny1" is that just a mis-spelling of "many" or do you mean something different? – Paul Johnson, Nov 15 '14 at 14:43
Bear in mind that "" == [] because literal strings are just syntactic sugar for lists of characters. Hence [([], "a")] == [("", "a")] – Paul Johnson, Nov 15 '14 at 14:45
It's pretty easy: `many` means parse *zero* or more occurrences of a given item. The string `"a"` starts with zero digits, hence the `""` at the beginning. – Bakuriu, Nov 15 '14 at 14:45

score 3 · Accepted Answer · answered Nov 15 '14 at 14:44

If your confusion is that you get back [("", "a")] when you expected [([], "a")]:

A string is a list of Chars. So "" is an empty list of Chars. Since [] is an empty list of any type, that means that "" is just a special case of []. In other words [] :: [Char] is completely equivalent to "".
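
A quick way to see this for yourself is the following minimal sketch. It uses only the Prelude, and the main wrapper is just there so it can be run as a program:

```haskell
main :: IO ()
main = do
  -- The same empty list, shown at two different element types.
  print ([] :: [Char])          -- prints ""
  print ([] :: [Int])           -- prints []
  -- The empty list of Chars and the empty string literal are equal.
  print (([] :: [Char]) == "")  -- prints True
```

The value is the same in both cases; only the Show instance chosen for the element type differs.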

So since your parser is expected to produce a string, the empty list is known to be of type [Char] and thus printed as "" instead of [].
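
To try this with the parser itself, here is a self-contained sketch of the definitions from the question. A few parts are my assumptions rather than the book's text: the newtype declaration for Parser, the Functor/Applicative/Monad instances (current GHC needs them before do-notation works, and return then comes from pure), and the patterns (r:_) in place of [(v,out)] so that the case expressions are total.

```haskell
import Data.Char (isDigit)

-- Assumed representation: a parser is a function from input to results.
newtype Parser a = P (String -> [(a, String)])

parse :: Parser a -> String -> [(a, String)]
parse (P p) inp = p inp

-- Instances required by current GHC (not shown in the question).
instance Functor Parser where
  fmap g p = P (\inp -> [(g v, out) | (v, out) <- parse p inp])

instance Applicative Parser where
  pure v = P (\inp -> [(v, inp)])
  pf <*> px = P (\inp -> case parse pf inp of
                           []           -> []
                           ((g, out):_) -> parse (fmap g px) out)

instance Monad Parser where
  p >>= f = P (\inp -> case parse p inp of
                         []           -> []
                         ((v, out):_) -> parse (f v) out)

failure :: Parser a
failure = P (\_ -> [])

item :: Parser Char
item = P (\inp -> case inp of
                    []     -> []
                    (x:xs) -> [(x, xs)])

-- Try p; if it fails, run q on the same input.
(+++) :: Parser a -> Parser a -> Parser a
p +++ q = P (\inp -> case parse p inp of
                       []    -> parse q inp
                       (r:_) -> [r])

sat :: (Char -> Bool) -> Parser Char
sat p = do
  x <- item
  if p x then return x else failure

digit :: Parser Char
digit = sat isDigit

many, many1 :: Parser a -> Parser [a]
many  p = many1 p +++ return []
many1 p = do
  v  <- p
  vs <- many p
  return (v:vs)

main :: IO ()
main = do
  print (parse (many1 digit) "a")   -- prints []
  print (parse (many digit) "a")    -- prints [("","a")]
  print (parse (many digit) "12a")  -- prints [("12","a")]
  -- Same shape of parser, but the result type is [Int], not String:
  print (parse (failure +++ return ([] :: [Int])) "a")  -- prints [([],"a")]
```

The last two lines of interest are the second and fourth prints: both parsers succeed with an empty list and leave "a" unconsumed, and the only difference in the output comes from the element type of that list.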

sepp2k

Thank you! That is it! So it's just something the compiler interprets based on the type, right? I just tried:

:{
*| let test :: Parser String
*| test = (failure +++ return [])
*| :}
*> parse (test) "a"
[("","a")]

And I get back the expected result. – Marin Nov 15 '14 at 16:22
