
I am working with data that has missing values, represented simply as lists of Maybe values. I would like to perform various aggregate/statistical operations that simply ignore the missing values.

This is related to the following questions:

Idiomatic way to sum a list of Maybe Int in haskell

How to use the maybe monoid and combine values with a custom operation, easily?

However, the former question is content with returning Nothing if any value is missing, which is not an option in my case. I have a solution which involves creating a Num instance for Maybe. However, that means it is specific to addition and multiplication and it has some other problems, too.

import Control.Applicative (liftA2)

instance Num a => Num (Maybe a) where
  negate      = fmap negate
  (+)         = liftA2 (+)
  (*)         = liftA2 (*)
  fromInteger = pure . fromInteger
  abs         = fmap abs
  signum      = fmap signum

Based on that, we can do something like this:

maybeCombineW :: (a -> a -> a) -> Maybe a -> Maybe a -> Maybe a
maybeCombineW f (Just x)  (Just y)  = Just (f x y)
maybeCombineW _ (Just x)  Nothing   = Just x
maybeCombineW _ Nothing   (Just y)  = Just y
maybeCombineW _ Nothing   Nothing   = Nothing


maybeCombineS :: (a -> a -> a) -> Maybe a -> Maybe a -> Maybe a
maybeCombineS f (Just x)  (Just y)  = Just (f x y)
maybeCombineS _ _          _        = Nothing


class (Num a) => Num' a where
  (+?) :: a -> a -> a
  (*?) :: a -> a -> a
  (+!) :: a -> a -> a
  (*!) :: a -> a -> a
  (+?) = (+)
  (*?) = (*)
  (+!) = (+)
  (*!) = (*)

instance {-# OVERLAPPABLE  #-} (Num a) => Num' a
instance {-# OVERLAPPING  #-} (Num' a) => Num' (Maybe a) where
  (+?) = maybeCombineW (+?)
  (*?) = maybeCombineW (*?)
  (+!) = maybeCombineS (+!)
  (*!) = maybeCombineS (*!)


sum' :: (Num' b, Foldable t) => t b -> b
sum' = foldr (+?) 0

sum'' :: (Num' b, Foldable t) => t b -> b
sum'' = foldr (+!) 0

What I like about this: It gives me two functions, a lenient sum' and a strict sum'', from which I can choose as needed. I can use the same functions to sum any Num instances, so I can reuse the same code for lists without Maybe without converting them first.

What I don't like about this: The instance overlap. Also, for any operation other than addition and multiplication, I have to specify a new type class and make new instances.

Therefore, I was wondering whether it is somehow possible to get a nice and general solution, perhaps along the lines suggested in the second question, which treats Nothing as the mempty for any operation in question.

Is there a nice idiomatic way of doing this?

Edit: Here is the best solution so far:

inout i o = ((fmap o) . getOption) . foldMap (Option . (fmap i))
sum' = Sum `inout` getSum
min' = Min `inout` getMin
-- etc.
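
For readers on a recent GHC, here is a hedged variant of the same idea. `Option` was deprecated and later removed from `Data.Semigroup` in newer versions of base, and since base 4.11 the `Monoid (Maybe a)` instance only requires `Semigroup a`, so `Maybe` can play the role of `Option` directly. The type signatures and the names `sumL`, `minL`, and `maxL` are illustrative assumptions, not part of the original code.

```haskell
import Data.Monoid (Sum (..))
import Data.Semigroup (Max (..), Min (..))

-- Wrap each present value with `wrap`, combine the wrapped values with
-- the Maybe monoid (Nothing acts as mempty), then unwrap the result.
inout :: (Foldable t, Semigroup w) => (a -> w) -> (w -> b) -> t (Maybe a) -> Maybe b
inout wrap unwrap = fmap unwrap . foldMap (fmap wrap)

-- Lenient aggregates (hypothetical names): missing values are ignored,
-- and the result is Nothing only when no value is present at all.
sumL :: (Foldable t, Num a) => t (Maybe a) -> Maybe a
sumL = inout Sum getSum

minL :: (Foldable t, Ord a) => t (Maybe a) -> Maybe a
minL = inout Min getMin

maxL :: (Foldable t, Ord a) => t (Maybe a) -> Maybe a
maxL = inout Max getMax
```

Because `Min` and `Max` only need to be semigroups here, no `Bounded` constraint is required; the empty case is covered by `Nothing`.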
kloffy

2 Answers


There is an instance of Monoid that does exactly the right thing:

instance Monoid a => Monoid (Maybe a) where
  mempty = Nothing
  Nothing `mappend` m = m
  m `mappend` Nothing = m
  Just m1 `mappend` Just m2 = Just (m1 `mappend` m2)

It's in Data.Monoid.

Thus,

foldMap (liftA Sum) [Just 1, Nothing, Just 2, Nothing, Just 3] = 
   fold [Just (Sum 1), Nothing, Just (Sum 2), Nothing, Just (Sum 3)] = 
      Just (Sum 6)

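For strict left-folding versions, use foldl' mappend mempty instead of fold, and foldl' (\acc x -> mappend acc (f x)) mempty instead of foldMap f. (The point-free form mappend . f only works as the step function of foldr, where the element is the first argument.) In the Maybe monoid, mempty is Nothing.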
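
As a rough sketch of the strict left fold applied to the sum example, assuming a base version where `<>` is available from the Prelude (4.11 or later); the name `lenientSum` is made up for illustration:

```haskell
import Data.Foldable (foldl')
import Data.Monoid (Sum (..))

-- Lenient sum as a strict left fold: the accumulator starts at
-- mempty (Nothing for the Maybe monoid), and each element is wrapped
-- in Sum before being combined into the accumulator.
lenientSum :: (Foldable t, Num a) => t (Maybe a) -> Maybe a
lenientSum = fmap getSum . foldl' step mempty
  where
    step acc x = acc <> fmap Sum x
```

With this definition, `lenientSum [Just 1, Nothing, Just 2, Nothing, Just 3]` evaluates to `Just 6`, and a list containing only `Nothing` values gives `Nothing`.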

n. m. could be an AI
  • Cool, thanks for pointing that out, that definitely seems more idiomatic. So I could write `sum' = (liftA getSum) . foldMap (liftA Sum)`. However, this means that if I do want to reuse some code that uses `sum'` where there are no missing values, I still have to wrap the values in Maybe first, right? (Not saying that it would be terrible if that were the case, just checking...) – kloffy Oct 06 '16 at 10:46
  • Also, how would I go about using this approach with another operation, such as for example getting the minimum/maximum value of the list? (And also suppose that I wanted both, a lenient and a strict version, like in my example. If you could show how to do that, I would be happy to accept the answer!) – kloffy Oct 06 '16 at 10:47
  • Not sure if it's possible to apply the same function to `[a]` and `[Maybe a]` and get the same type of result. What if `a` is `Maybe b`? – n. m. could be an AI Oct 06 '16 at 11:03
  • You can use `liftA Product` for example. `Min` is not a monoid (what's a minimum of the empty list?) but you can write your own Min-like wrapper so that `Maybe (MyMin a)` is a monoid, and use `liftA MyMin` for a foldable of maybes. – n. m. could be an AI Oct 06 '16 at 11:15
  • Yes, fair point about the type ambiguity, I suppose that's at the root of the overlapping instances in my solution. You are also right about `Min` not being a `Monoid`, so there is another thing I probably should not have put in the title. The point is that it doesn't have to be a Monoid, because on an empty list the result can be `Nothing`. Your answer has helped me understand the problem better, and I have now implemented a pretty good solution using `Data.Semigroup` (see edit). Unless any other clever solution comes up, I will accept it. – kloffy Oct 06 '16 at 11:55
  • @kloffy If you believe you have a better solution, the correct way to indicate this is to write it as an answer and accept it, not to edit the solution into the question. (Answering your own question is not considered impolite here on SO.) – Daniel Wagner Oct 06 '16 at 20:25
  • @DanielWagner Thanks for pointing that out! I still think this is the right answer to accept, since the code in my edit is a variation on the basic approach suggested here (just a slightly more general one). I just wanted to document it for others that might be interested. – kloffy Oct 07 '16 at 04:34

How about just using catMaybes from Data.Maybe to discard all the Nothing values? Then you can run any aggregations and calculations on a list of just plain values.
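
A small sketch of what this could look like, with a strict counterpart for comparison. All function names below are hypothetical; `sequenceA` is used here as one possible way to get the "Nothing if anything is missing" behaviour, and `fromMaybe 0` is one way to keep list positions intact for a running total.

```haskell
import Data.Maybe (catMaybes, fromMaybe)

-- Lenient: drop the missing values, then aggregate the plain values.
lenientTotal :: Num a => [Maybe a] -> a
lenientTotal = sum . catMaybes

-- Lenient minimum: Nothing only when no value is present.
lenientMinimum :: Ord a => [Maybe a] -> Maybe a
lenientMinimum xs = case catMaybes xs of
  [] -> Nothing
  ys -> Just (minimum ys)

-- Strict: the whole result is Nothing if any value is missing.
strictTotal :: Num a => [Maybe a] -> Maybe a
strictTotal = fmap sum . sequenceA

-- Running total that keeps one output per input, counting a missing
-- value as zero instead of discarding it.
runningTotal :: Num a => [Maybe a] -> [a]
runningTotal = scanl1 (+) . map (fromMaybe 0)
```

Note that `catMaybes` changes the length of the list, so position-sensitive operations such as the running total need the `fromMaybe` approach instead.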

shang
  • Yes, I was aware of `catMaybes`, but it didn't quite feel like it was getting to the root of the problem. For example, suppose instead of just a plain sum I wanted to do a cumulative sum and just keep adding zeros for the Nothing elements? (Although, in that case I suppose I shouldn't have said fold in the title, my mistake.) – kloffy Oct 06 '16 at 11:11