
According to Haskell's library documentation, every instance of the Applicative class must satisfy these four laws:

  • identity: pure id <*> v = v
  • composition: pure (.) <*> u <*> v <*> w = u <*> (v <*> w)
  • homomorphism: pure f <*> pure x = pure (f x)
  • interchange: u <*> pure y = pure ($ y) <*> u

It then says that as a consequence of these rules, the underlying Functor instance will satisfy fmap f x = pure f <*> x. But since the method fmap does not even appear in the above equations, how exactly does this property follow from them?

XiaohuWang
  • There is an extra constraint that `liftA2 f x y = f <$> x <*> y` and `(<*>) = liftA2 id`. Since `(<$>)` is `fmap`, it is thus mentioned. – Willem Van Onsem Jan 04 '20 at 23:22
  • 1
    @WillemVanOnsem But wouldn't `fmap f x = pure f <*> x` be true even if `liftA2` didn't exist? After all, you can define an instance in terms of `liftA2` and then `<$>` really doesn't appear in any of the definitions, but it'll still hold. – Joseph Sible-Reinstate Monica Jan 04 '20 at 23:41
  • @WillemVanOnsem Yes, but I tried to rewrite `pure f <*> x` using this rule and got `pure f <*> x = liftA2 id (pure f) x = id <$> (pure f) <*> x = pure f <*> x`, which didn't bring me anywhere. The last equation follows from the requirement `fmap id = id` in the Functor class. – XiaohuWang Jan 05 '20 at 16:48

1 Answer


Update: I've substantially expanded the answer. I hope it helps.

"Short" answer:

For any functor F, there is a "free theorem" (see below) for the type:

(a -> b) -> F a -> F b

This theorem states that for any (total) function, say foo, with this type, the following will be true for any functions f, f', g, and h, with appropriate matching types:

If f' . g = h . f, then foo f' . fmap g = fmap h . foo f.

Note that it is not at all obvious why this should be true.

Anyway, if you set f = id and g = id and use the functor law fmap id = id, this theorem simplifies to:

For all h, we have foo h = fmap h . foo id.
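To see what this specialized theorem is claiming, here's a small sketch at the functor Maybe. The function fooMaybe and its name are my own illustrative choices; the theorem applies to any total function of this type, not just this one:

```haskell
-- A sample 'foo' of the required polymorphic type, instantiated at Maybe.
fooMaybe :: (a -> b) -> Maybe a -> Maybe b
fooMaybe _ Nothing  = Nothing
fooMaybe f (Just x) = Just (f x)

-- The specialized theorem: for all h, foo h = fmap h . foo id.
-- Spot-checked here at a couple of sample inputs.
theoremHolds :: Bool
theoremHolds =
  fooMaybe (+ 1) (Just (2 :: Int)) == (fmap (+ 1) . fooMaybe id) (Just 2)
    && fooMaybe (+ 1) Nothing == (fmap (+ 1) . fooMaybe id) Nothing
```

Of course, a few spot checks aren't a proof; the point of the free theorem is that this equality holds for every total function with this type.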

Now, if F is also an applicative, then the function:

foo :: (a -> b) -> F a -> F b
foo f x = pure f <*> x

has the right type, so it satisfies the theorem. Therefore, for all h, we have:

pure h <*> x
-- by definition of foo
= foo h x
-- by the specialized version of the theorem
= (fmap h . foo id) x
-- by definition of the operator (.)
= fmap h (foo id x)
-- by the definition of foo
= fmap h (pure id <*> x)
-- by the identity law for applicatives
= fmap h x

In other words, the identity law for applicatives implies the relation:

pure h <*> x = fmap h x
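This derived law is easy to spot-check for stock applicatives (the check names below are mine, not part of any API):

```haskell
-- Spot checks of pure h <*> x = fmap h x at two standard applicatives.
checkMaybe :: Bool
checkMaybe = (pure (+ 1) <*> Just (2 :: Int)) == fmap (+ 1) (Just 2)

checkList :: Bool
checkList = (pure show <*> [1, 2, 3 :: Int]) == fmap show [1, 2, 3]
```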

It is unfortunate that the documentation does not include some explanation or at least acknowledgement of this extremely non-obvious fact.

Longer answer:

Originally, the documentation listed the four laws (identity, composition, homomorphism, and interchange), plus two additional laws for *> and <*, and then simply stated:

The Functor instance should satisfy

fmap f x = pure f <*> x

The wording above was replaced with the new text:

As a consequence of these laws, the Functor instance for f will satisfy

fmap f x = pure f <*> x

as part of commit 92b562403 in February 2011 in response to a suggestion made by Russell O'Connor on the libraries list.

Russell pointed out that this rule was actually implied by the other applicative laws. Originally, he offered the following proof (the link in the post is broken, but I found a copy on archive.org). He pointed out that the function:

possibleFmap :: Applicative f => (a -> b) -> f a -> f b
possibleFmap f x = pure f <*> x

satisfies the Functor laws for fmap:

pure id <*> x = x {- Identity Law -}

pure (f . g) <*> x
= {- infix to prefix -}
pure ((.) f g) <*> x
= {- 2 applications of homomorphism law -}
pure (.) <*> pure f <*> pure g <*> x
= {- composition law -}
pure f <*> (pure g <*> x)
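As a runnable sketch of that claim, here is possibleFmap together with spot checks of both functor laws at Maybe (the check names are my own):

```haskell
possibleFmap :: Applicative f => (a -> b) -> f a -> f b
possibleFmap f x = pure f <*> x

-- Identity law: possibleFmap id = id, spot-checked at a few points.
idLawHolds :: Bool
idLawHolds = possibleFmap id (Just 'a') == Just 'a'
          && possibleFmap id (Nothing :: Maybe Char) == Nothing

-- Composition law: possibleFmap (f . g) = possibleFmap f . possibleFmap g.
compositionLawHolds :: Bool
compositionLawHolds =
  possibleFmap ((* 2) . (+ 1)) (Just (3 :: Int))
    == (possibleFmap (* 2) . possibleFmap (+ 1)) (Just 3)
```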

and then reasoned that:

So, \f x -> pure f <*> x satisfies the laws of a functor. Since there is at most one functor instance per data type, (\f x -> pure f <*> x) = fmap.

A key part of this proof is that there is only one possible functor instance (i.e., only one way of defining fmap) per data type.

When asked about this, he gave the following proof of the uniqueness of fmap.

Suppose we have a functor f and another function

foo :: (a -> b) -> f a -> f b

Then, as a consequence of the free theorem for foo, for any f :: a -> b and any g :: b -> c:

foo (g . f) = fmap g . foo f

In particular, if foo id = id, then

foo g = foo (g . id) = fmap g . foo id = fmap g . id = fmap g
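As a sketch of this uniqueness argument, take any foo at Maybe for which foo id = id; the reasoning above forces it to agree with fmap (the definition and names here are my own illustration):

```haskell
-- Any total 'foo' at Maybe satisfying foo id = id must coincide with fmap.
foo :: (a -> b) -> Maybe a -> Maybe b
foo _ Nothing  = Nothing
foo f (Just x) = Just (f x)   -- note that foo id = id holds for this definition

-- Spot check: foo agrees with fmap, as the uniqueness argument predicts.
agreesWithFmap :: Bool
agreesWithFmap =
  foo (+ 1) (Just (41 :: Int)) == fmap (+ 1) (Just 41)
    && foo (+ 1) (Nothing :: Maybe Int) == fmap (+ 1) Nothing
```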

Obviously, this depends critically on the "consequence of the free theorem for foo". Later, Russell realized that the free theorem could be used directly, together with the identity law for applicatives, to prove the needed law. That's what I've summarized in my "short answer" above.

Free Theorems...

So what about this "free theorem" business?

The concept of free theorems comes from a paper by Wadler, "Theorems for Free". Here's a Stack Overflow question that links to the paper and some other resources. Understanding the theory "for real" is hard, but you can think about it intuitively. Let's pick a specific functor, like Maybe. Suppose we had a function with the following type:

foo :: (a -> b) -> Maybe a -> Maybe b
foo f x = ...

Note that, no matter how complex and convoluted the implementation of foo is, that same implementation needs to work for all types a and b. It doesn't know anything about a, so it can't do anything with values of type a, other than apply the function f, and that just gives it a b value. It doesn't know anything about b either, so it can't do anything with a b value, except maybe return Just someBvalue. Critically, this means that the structure of the computation performed by foo -- what it does with the input value x, whether and when it decides to apply f, etc. -- is entirely determined by whether x is Nothing or Just ....

Think about this for a bit -- foo can inspect x to see if it's Nothing or Just someA. But, if it's Just someA, it can't learn anything about the value someA: it can't use it as-is because it doesn't understand the type a, and it can't do anything with f someA, because it doesn't understand the type b. So, if x is Just someA, the function foo can only act on its Just-ness, not on the underlying value someA.

This has a striking consequence. If we were to use a function g to change the input values out from under foo f x by writing:

foo f' (fmap g x)

because fmap g doesn't change x's Nothing-ness or Just-ness, this change has no effect on the structure of foo's computation. It behaves the same way, processing the Nothing or Just ... value in the same way, applying f' in exactly the same circumstances and at exactly the same time that it previously applied f, and so on.

This means that, as long as we've arranged things so that f' acting on the g-transformed value gives the same answer as an h-transformed version of f acting on the original value -- in other words if we have:

f' . g = h . f

then we can trick foo into processing our g-transformed input in exactly the same way it would have processed the original input, as long as we account for the input change after foo has finished running by applying h to the output:

foo f' (fmap g x) = fmap h (foo f x)

I don't know whether or not that's convincing, but that's how we get the free theorem:

If f' . g = h . f then foo f' . fmap g = fmap h . foo f.

It basically says that because we can transform the input in a way that foo won't notice (because of its polymorphic type), the answer is the same whether we transform the input and run foo, or run foo first and transform its output instead.
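To ground this, here's a sketch with concrete witnesses at Maybe. The particular functions f, g, f', and h below are arbitrary choices of mine, picked so that the precondition f' . g = h . f holds:

```haskell
-- A sample 'foo' of the polymorphic type; any total function of this type works.
foo :: (a -> b) -> Maybe a -> Maybe b
foo k = fmap k

f, g, f', h :: Int -> Int
f  = (+ 1)
g  = (* 10)
f' = (+ 1)
h  = \m -> 10 * m - 9   -- chosen so that f' . g = h . f

-- Precondition of the theorem: f' . g = h . f.
preconditionHolds :: Int -> Bool
preconditionHolds n = (f' . g) n == (h . f) n

-- Conclusion of the theorem: foo f' . fmap g = fmap h . foo f.
conclusionHolds :: Maybe Int -> Bool
conclusionHolds x = (foo f' . fmap g) x == (fmap h . foo f) x
```

For instance, at Just 3 both sides of the conclusion come out to Just 31: transform-then-run gives fmap (+ 1) (Just 30), and run-then-transform gives fmap h (Just 4).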

K. A. Buhr
  • Thank you for your very detailed answer. I have not yet gone through everything you referenced, but I think I got the basic idea. So if I understand correctly, the free theorems are established based on how you can define functions in Haskell, and since all functions must be defined using a certain syntax, they must have some certain properties. However, if I were to define a function externally (e.g. in C), could I then maybe use the FFI and inspect the structures of its arguments, and thus define a function that violates the free theorems? – XiaohuWang Jan 06 '20 at 17:25
  • Yes, the above assumes that `fmap` doesn't "cheat". If you did want to cheat, you wouldn't need to use the FFI and external functions. If you just write an `fmap` that doesn't terminate, that violates the law; and you could undoubtedly use unsafe functions like `unsafeCoerce` to write an `fmap` that misbehaves. – K. A. Buhr Jan 07 '20 at 04:41
  • I see. Although what I mean is that by "cheating" we could write two distinct implementations of `fmap` which both follow the law, that is they both satisfy `fmap id = id` and `fmap (f . g) = fmap f . fmap g`, but yet also behave differently than each other. In this case, an implementation that doesn't terminate wouldn't count, since it wouldn't satisfy `fmap id = id`. – XiaohuWang Jan 07 '20 at 14:00