8

I'm learning Haskell at the moment, and I would like to confirm or debunk my understanding of monoids.

What I figured out from reading the CIS194 course is that a monoid is basically an "API" for defining a custom binary operation on a custom set.

Then I went to inform myself some more and stumbled upon a massive amount of very confusing tutorials trying to clarify the thing, so I'm not so sure anymore.

I have a decent mathematical background, but I got confused by all the metaphors and am looking for a clear yes/no answer on my understanding of monoids.

Reygoch
  • 3
    A monoid is an algebraic structure (S) with an associative binary operation over it ((.) : S * S -> S) and an identity element (id : S) for that operation. It must satisfy the laws of associativity (i.e. a . (b . c) = (a . b) . c) and identity (id . x = x . id = x). – Aadit M Shah Sep 02 '15 at 17:37

3 Answers

8

From Wikipedia:

In abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single associative binary operation and an identity element.

I think your understanding is correct. From a programming perspective, Monoid is an interface with two "methods" that must be implemented.

The only piece that seems to be missing from your description is the "identity", without which you are describing a Semigroup.

Anything that has a "zero" or an "empty" and a way of combining two values can be a Monoid. One thing to note is that it may be possible for a set/type to be made a Monoid in more than one way, for example numbers via addition with identity 0, or multiplication with identity 1.
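As a small sketch of the "more than one way" point: `Data.Monoid` in base provides the `Sum` and `Product` newtype wrappers, which pick addition or multiplication for the same underlying numbers. The names `total` and `scaled` and the sample values are made up for illustration, and the snippet assumes GHC 8.4 or later, where `<>` is exported by the Prelude.

```haskell
import Data.Monoid (Sum(..), Product(..))

-- Addition: the identity (mempty) is Sum 0.
total :: Int
total = getSum (Sum 2 <> Sum 3 <> mempty)

-- Multiplication: the identity (mempty) is Product 1.
scaled :: Int
scaled = getProduct (Product 2 <> Product 3 <> mempty)

main :: IO ()
main = do
  print total   -- 5
  print scaled  -- 6
```

Wrapping in a newtype is how one type ends up with several `Monoid` instances, since a type can only have one instance per class.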

ryachza
  • Ye, I figured that part about rules it has to satisfy from mathematical definition, but I thought there might be more to monoids in Haskell than that since all tutorials I've found go to such amazing lengths and metaphors to explain them. – Reygoch Sep 02 '15 at 17:41
  • 2
    @Reygoch The convoluted tutorials probably exist because people are often scared of math, and so the writers feel like they need kid gloves to handle monoids just because the name is scary. One explanation of monoids that tries not to mystify things is [the one at Wikibooks](https://en.wikibooks.org/wiki/Haskell/Monoids) (disclaimer: I am a contributor there). – duplode Sep 02 '15 at 17:55
  • 4
    "Anything that has a "zero" or an "empty" and a way of combining two values is a monoid." The way of combining must also be associative. – jberryman Sep 02 '15 at 18:52
  • @jberryman Agreed from a mathematical perspective. But from a programming perspective, I think it's more like *really, really should* so as to avoid surprising implementations. – ryachza Sep 02 '15 at 19:02
  • 1
    @ryachza If the implementation is surprising in such a way, you have a `Monoid` that is not a monoid. – duplode Sep 02 '15 at 19:10
  • @duplode Totally agree. I was just looking to clarify the term "must". It would be awesome if associativity could be enforced by the compiler, but unfortunately that's not the case. – ryachza Sep 02 '15 at 19:13
  • @ryachza The standard way of distinguishing those things is to be careful about the difference between `Monoid` (capitalized) and monoid (not). With that in mind, I would object to "Anything that has a 'zero' or an 'empty' and a way of combining two values is a monoid." but not to "Anything that has a 'zero' or an 'empty' and a way of combining two values can be a `Monoid`.". (I also changed "is" to "can be" -- for the obvious reason.) – Daniel Wagner Sep 02 '15 at 20:00
  • @DanielWagner You're right I've updated the answer to hopefully prevent confusion for future readers. I had been using "Monoid" to refer to the interface/class, and "monoid" to refer to things that could be made an instance of that class (pragmatically, not mathematically) and reworded it to only refer to the interface. – ryachza Sep 02 '15 at 20:13
  • 1
    `Monoid` instances must obey the monoid laws up to the fundamental notion of equivalence for the type in question. In most cases (e.g., queue/sequence concatenation, priority queue melding, or set union) this will not be structural equality. However, failing to define a proper monoid in this weakened sense will make anyone using your library hate you. – dfeuer Sep 03 '15 at 04:38
5

From Wolfram:

A monoid is a set that is closed under an associative binary operation and has an identity element I in S such that for all a in S, Ia=aI=a.

From Wikipedia:

In abstract algebra, a branch of mathematics, a monoid is an algebraic structure with a single associative binary operation and an identity element.

So your intuition is more or less right.

You should only keep in mind that it's not defined for a "custom set" in Haskell but a type. The distinction is small (because types in type theory are very similar to sets in set theory) but the types for which you can define a Monoid instance need not be types that represent mathematical sets.

In other words: a type describes the set of all values that are of that type. Monoid is an "interface" that states that any type that claims to adhere to that interface must provide an identity value, a binary operation combining two values of that type, and there are some equations these should satisfy in order for all generic Monoid operations to work as intended (such as the generic summation of a list of monoid values) and not produce illogical/inconsistent results.

Also, note that the existence of an identity element in that set (type) is required for a type to be an instance of the Monoid class.

For example, natural numbers form a monoid under both addition (identity = 0):

0 + n = n
n + 0 = n

as well as multiplication (identity = 1):

1 * n = n
n * 1 = n

Lists also form a monoid under ++ (identity = []):

[] ++ xs = xs
xs ++ [] = xs

Functions of type a -> a also form a monoid under composition (identity = id):

id . f = f
f . id = f

So it's important to keep in mind that Monoid isn't about types that represent sets but about types when viewed as sets, so to speak.
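The list and function cases can be tried directly. In base, functions of type a -> a under composition are exposed through the `Endo` newtype from `Data.Monoid` (plain functions have a different, pointwise instance). The names `joined`, `pipeline` and `result` and the sample functions are made up for illustration; GHC 8.4 or later is assumed.

```haskell
import Data.Monoid (Endo(..))

-- Lists: (<>) is (++) and mempty is [].
joined :: [Int]
joined = [1, 2] <> mempty <> [3]

-- Functions a -> a: Endo composes them, with id as the identity.
pipeline :: Endo Int
pipeline = Endo (+ 1) <> Endo (* 2) <> mempty

-- Equivalent to ((+ 1) . (* 2) . id) 5
result :: Int
result = appEndo pipeline 5

main :: IO ()
main = do
  print joined  -- [1,2,3]
  print result  -- 11
```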


As an example of a malformed Monoid instance, consider:

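import Data.Monoid

newtype MyInt = MyInt Int deriving Show

-- Since GHC 8.4, Semigroup is a superclass of Monoid, so the
-- binary operation is defined here as (<>).
instance Semigroup MyInt where
  MyInt a <> MyInt b = MyInt (a * b)

instance Monoid MyInt where
  mempty = MyInt 0  -- deliberately wrong: 0 is not an identity for (*)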

If you now try to mconcat a list of MyInt values, you'll always get MyInt 0 as the result, because 0 is not an identity for * (multiplying by it turns everything into 0):

λ> mconcat [MyInt 1, MyInt 2]
MyInt 0
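One way to spot this kind of mistake is to test the identity laws on a few sample values. The sketch below assumes GHC 8.4 or later; `BadInt`, `leftIdentity` and `rightIdentity` are hypothetical names used only for illustration, with `BadInt` mirroring the broken instance above. Passing on samples does not prove an instance lawful, but a single failure shows that it is not.

```haskell
-- Same shape as the broken instance: (*) paired with 0.
newtype BadInt = BadInt Int deriving (Show, Eq)

instance Semigroup BadInt where
  BadInt a <> BadInt b = BadInt (a * b)

instance Monoid BadInt where
  mempty = BadInt 0  -- wrong on purpose; the identity for (*) is 1

-- Left identity law: mempty <> x == x
leftIdentity :: (Eq m, Monoid m) => m -> Bool
leftIdentity x = mempty <> x == x

-- Right identity law: x <> mempty == x
rightIdentity :: (Eq m, Monoid m) => m -> Bool
rightIdentity x = x <> mempty == x

main :: IO ()
main = do
  print (leftIdentity (BadInt 2))     -- False: 0 * 2 is 0, not 2
  print (rightIdentity (BadInt 2))    -- False: 2 * 0 is 0, not 2
  print (leftIdentity [1, 2 :: Int])  -- True: lists are a lawful monoid
```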
Erik Kaplun
3

At a basic level you're right - it's just an API for a binary operator we denote by <>.

However, the value of the monoid concept is in its relationship to other types and classes. Culturally we've decided that <> is the natural way of joining/appending two things of the same type together.

Consider this example:

{-# LANGUAGE OverloadedStrings #-}

import Data.Monoid

greet x = "Hello, " <> x

The function greet is extremely polymorphic - x can be a String, ByteString or Text just to name a few possibilities. Moreover, in each of these cases it does basically what you expect it to - it appends x to the string `"Hello, "`.

Additionally, there are lots of algorithms which will work on anything that can be accumulated, and those are good candidates for generalization to a Monoid. For example consider the foldMap function from the Foldable class:

foldMap :: Monoid m => (a -> m) -> t a -> m

Not only does foldMap generalize the idea of folding over a structure, but I can generalize how the accumulation is performed by substituting the right Monoid instance.

If I have a foldable structure t containing Ints, I can use foldMap with the Sum monoid to get the sum of the Ints, or with Product to get the product, etc.
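A minimal sketch of that idea, using a list and a `Maybe` as the foldable structures. The names `xs`, `total`, `prod` and `asList` and the sample data are illustrative assumptions; `foldMap` comes from the Prelude and `Sum`/`Product` from `Data.Monoid`.

```haskell
import Data.Monoid (Sum(..), Product(..))

xs :: [Int]
xs = [1, 2, 3, 4]

-- Same traversal, different accumulation: only the wrapper changes.
total :: Int
total = getSum (foldMap Sum xs)

prod :: Int
prod = getProduct (foldMap Product xs)

-- Mapping each element to a singleton list collects the elements,
-- here from a Maybe instead of a list.
asList :: [Int]
asList = foldMap (\x -> [x]) (Just 7)

main :: IO ()
main = do
  print total   -- 10
  print prod    -- 24
  print asList  -- [7]
```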

Finally, using <> affords convenience. For instance, there is an abundance of different Set implementations, but for all of them s <> t is always the union of two sets s and t (of the same type). This enables me to write code which is agnostic of the underlying implementation of the set thereby simplifying my code. The same can be said for a lot of other data structures, e.g. sequences, trees, maps, priority queues, etc.
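As a rough illustration of implementation-agnostic code, the helper below (`merge3`, a hypothetical name) only requires a `Semigroup` constraint and so works unchanged for `Data.Set` and `Data.Map`. It assumes the containers package is available, as it is with a standard GHC installation, and GHC 8.4 or later.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

-- Works for any Semigroup, so it does not care which container it gets.
merge3 :: Semigroup c => c -> c -> c -> c
merge3 a b c = a <> b <> c

-- For Data.Set, (<>) is union.
mergedSet :: [Int]
mergedSet =
  Set.toList (merge3 (Set.fromList [1, 2]) (Set.fromList [2, 3]) (Set.fromList [5]))

-- For Data.Map, (<>) is a left-biased union: on duplicate keys
-- the value from the left map wins.
mergedMap :: [(Int, String)]
mergedMap =
  Map.toList (merge3 (Map.fromList [(1, "a")]) (Map.fromList [(1, "b")]) (Map.fromList [(2, "c")]))

main :: IO ()
main = do
  print mergedSet  -- [1,2,3,5]
  print mergedMap  -- [(1,"a"),(2,"c")]
```

Note that for maps the union discards the right-hand value on a key clash rather than combining the values, which may or may not be what a caller expects.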

ErikR