Start by looking at the definition of Parser:
newtype Parser a = Parser {parse :: String -> [(a,String)]}
A Parser a is really just a wrapper around a function (that we can run later with parse) that takes a String and returns a list of pairs, where each pair contains an a encountered when processing the string, along with the rest of the string that remains to be processed.
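To make that concrete, here is a minimal sketch of the newtype together with one tiny parser. The name item is an assumption for illustration (your code may call it something else); it consumes a single character and fails on empty input.

```haskell
-- Same shape as the definition above: a Parser wraps a function
-- from the input string to a list of (value, remaining input) pairs.
newtype Parser a = Parser { parse :: String -> [(a, String)] }

-- Hypothetical example parser: consume exactly one character.
-- An empty result list means "this parser failed".
item :: Parser Char
item = Parser $ \s -> case s of
  []       -> []
  (c : cs) -> [(c, cs)]

-- In GHCi:
--   parse item "abc"   evaluates to [('a',"bc")]
--   parse item ""      evaluates to []
```

Note that nothing happens until parse is applied: item by itself is only a wrapped function.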
Now look at the part of the code in chainl1 that's confusing you: the part where you extract f from op:
f <- op
You remarked: "I do not get how you extract a function from the parser when it should return a list of tuples."
It's true that when we run a Parser a with a string (using parse), we get a list of type [(a,String)] as a result. But this code does not say parse op s. Rather, we are using bind here (with the do-notation syntactic sugar). The problem is that you're thinking about the definition of the Parser datatype, but not about what bind specifically does.
Let's look at what bind is doing in the Parser monad a bit more carefully.
bind :: Parser a -> (a -> Parser b) -> Parser b
bind p f = Parser $ \s -> concatMap (\(a, s') -> parse (f a) s') $ parse p s
What does p >>= f (that is, bind p f) do? It returns a Parser that, when given a string s, does the following. First, it runs parser p with the string to be parsed, s. This, as you correctly noted, returns a list of type [(a, String)]: i.e. a list of the values of type a encountered, along with the string that remained after each value was encountered. Then it takes this list of pairs and applies a function to each pair. Specifically, each (a, s') pair in this list is transformed by (1) applying f to the parsed value a (f a returns a new parser), and then (2) running this new parser with the remaining string s'. This is a function from a tuple to a list of tuples: (a, s') -> [(b, s'')]. Since we're mapping this function over every tuple in the original list returned by parse p s, this ends up giving us a list of lists of tuples: [[(b, s'')]]. So we concatenate (or join) this list into a single list [(b, s'')]. All in all then, we have a function from s to [(b, s'')], which we then wrap in a Parser newtype.
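The three steps can be spelled out by hand. This is only a sketch: item and pairWith are hypothetical helpers introduced for illustration, and step1 to step3 just unfold what concatMap does inside bind.

```haskell
newtype Parser a = Parser { parse :: String -> [(a, String)] }

-- bind exactly as defined above.
bind :: Parser a -> (a -> Parser b) -> Parser b
bind p f = Parser $ \s -> concatMap (\(a, s') -> parse (f a) s') $ parse p s

-- Hypothetical helper: consume one character.
item :: Parser Char
item = Parser $ \s -> case s of
  []       -> []
  (c : cs) -> [(c, cs)]

-- Hypothetical continuation: given the first character, parse a second
-- one and return both as a pair.
pairWith :: Char -> Parser (Char, Char)
pairWith c = Parser $ \s -> [((c, d), rest) | (d, rest) <- parse item s]

-- Step 1: run the first parser on the input.
step1 :: [(Char, String)]
step1 = parse item "abc"                                 -- [('a',"bc")]

-- Step 2: for each (a, s'), build the next parser and run it on s'.
step2 :: [[((Char, Char), String)]]
step2 = map (\(a, s') -> parse (pairWith a) s') step1    -- [[(('a','b'),"c")]]

-- Step 3: flatten the list of lists.
step3 :: [((Char, Char), String)]
step3 = concat step2                                     -- [(('a','b'),"c")]
```

Here step3 is the same list you get from parse (bind item pairWith) "abc". Notice that pairWith only ever receives a Char; it never sees the list of tuples.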
The crucial point is that when we say f <- op, or op >>= \f -> ..., that assigns the name f to the values parsed by op. But f is not a list of tuples, because it is not the result of running parse op s.
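A small sketch of the same idea with do-notation. The instances below are one plausible way to make Parser a monad, and addOp and applyOp are hypothetical names, not taken from your code; addOp stands in for the op argument of chainl1.

```haskell
newtype Parser a = Parser { parse :: String -> [(a, String)] }

instance Functor Parser where
  fmap g (Parser p) = Parser $ \s -> [(g a, s') | (a, s') <- p s]

instance Applicative Parser where
  pure a = Parser $ \s -> [(a, s)]
  Parser pf <*> Parser pa =
    Parser $ \s -> [(g a, s'') | (g, s') <- pf s, (a, s'') <- pa s']

instance Monad Parser where
  p >>= f = Parser $ \s -> concatMap (\(a, s') -> parse (f a) s') (parse p s)

-- Hypothetical operator parser: recognises '+' and yields the function (+).
addOp :: Parser (Int -> Int -> Int)
addOp = Parser $ \s -> case s of
  ('+' : rest) -> [((+), rest)]
  _            -> []

-- Inside the do block, f has type Int -> Int -> Int, not [(.., String)].
applyOp :: Parser Int
applyOp = do
  f <- addOp
  return (f 2 3)

-- In GHCi:
--   parse applyOp "+xyz"   evaluates to [(5,"xyz")]
--   parse applyOp "xyz"    evaluates to []
```

The list of tuples only shows up once you call parse on the finished parser; inside the do block you work with the plain function.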
In general, you'll see a lot of Haskell code that defines some datatype SomeMonad a, along with a bind method that hides a lot of the dirty details for you, and lets you get access to the a values you care about using do-notation like so: a <- ma. It may be instructive to look at the State s monad to see how bind passes around state behind the scenes for you. Similarly, here, when combining parsers, you care most about the values the parser is supposed to recognize. bind is hiding all the dirty work that involves the strings that remain upon recognizing a value of type a.
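For comparison, here is a hand-rolled sketch of a State-like type. It is a simplification for illustration and not the definition used by the transformers or mtl libraries; bindState, tick and twoTicks are hypothetical names.

```haskell
-- A state computation is a function from the current state
-- to a result plus the next state.
newtype State s a = State { runState :: s -> (a, s) }

-- bind threads the state: run m, hand its result to f,
-- then run the new computation with the updated state.
bindState :: State s a -> (a -> State s b) -> State s b
bindState m f = State $ \s ->
  let (a, s') = runState m s
  in  runState (f a) s'

-- Hypothetical example: return the counter, then increment it.
tick :: State Int Int
tick = State $ \n -> (n, n + 1)

-- The lambdas only see the results a and b; the counter itself
-- is passed along by bindState.
twoTicks :: State Int (Int, Int)
twoTicks =
  tick `bindState` \a ->
  tick `bindState` \b ->
  State $ \s -> ((a, b), s)

-- In GHCi:
--   runState twoTicks 0   evaluates to ((0,1),2)
```

The parallel with Parser is direct: in State the hidden plumbing is the state value, while in Parser it is the remaining input string (and the list of alternatives).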