11

I have a list like this:

let foo = [Just 1, Just 2, Nothing, Just 3, Nothing, Nothing]

By using catMaybes I can extract only the Just-constructed values:

catMaybes foo -- [1,2,3]

I'm now looking for a function that not only yields a list of Justs but also a count of Nothings for a finite list by traversing it once. It should have a signature like this:

catMaybesCount :: [Maybe a] -> ([a], Int)

Note: This question was answered Q&A-style and therefore intentionally does not show any research effort!

Uli Köhler
  • 13,012
  • 16
  • 70
  • 120

6 Answers6

23
import Data.Monoid
import Data.Foldable

catMaybesCount = foldMap inject where
    inject Nothing  = ([ ], Sum 1)
    inject (Just x) = ([x], Sum 0)
Daniel Wagner
  • 145,880
  • 9
  • 220
  • 380
  • 1
    Like the simplicity. +1 – David Unric Aug 27 '14 at 19:59
  • 1
    Definitive the most elegant solution, even though I like all of them +1 – Uli Köhler Aug 27 '14 at 20:22
  • Definitely the way to go. You can golf that down to `(id *** getSum) . foldMap (maybe ([], Sum 1) (pure &&& mempty))` (and who knows what else?), but the pattern matching is probably more readable. – pigworker Aug 28 '14 at 09:41
  • 2
    This answer deserves to be expanded upon. The key ideas are: (a) `catMaybes` and `count` are both folds, (b) [folds are made of monoids](http://byorgey.wordpress.com/2012/11/05/foldr-is-made-of-monoids/), and (c) the product of two monoids is a monoid (`instance (Monoid a, Monoid b) => Monoid (a, b)`. – Luis Casillas Aug 28 '14 at 18:16
5

We can have a left folding for strict counting and a right folding for lazy list building at the same time:

catMC :: (Num t) => [Maybe a] -> ([a], t)
catMC xs = g 0 xs 
  where 
    g !c (Nothing:xs) = g (c+1) xs 
    g !c (Just  v:xs) = let (a,b)=g c xs in (v:a,b) 
    g !c []           = ([],c)

This works for infinite lists too, as long as we don't access the count field (snd) of the result, while calculating the count in a strict, efficient manner, much like a mutable accumulator variable.

Will Ness
  • 70,110
  • 9
  • 98
  • 181
4

My choice would be just a foldr:

import Control.Arrow

catMaybesCount :: [Maybe a] -> ([a], Int)
catMaybesCount = foldr (maybe (second succ) (first . (:))) ([], 0)

There are pros and cons to both left and right folds in this case, as right fold makes the list result properly lazy and efficient, while strict left folding computes the length result more efficiently.

Will Ness
  • 70,110
  • 9
  • 98
  • 181
András Kovács
  • 29,931
  • 3
  • 53
  • 99
3

I'd use the Writer monad:

import Control.Arrow ( (***) )
import Data.Monoid ( Sum(..) )
import Control.Monad.Writer ( execWriter, tell )

catMaybesCount xs = (id *** getSum) $ execWriter (mapM_ go xs)
   where
      go (Just v) = tell ([v], Sum 0)
      go Nothing = tell ([], Sum 1)

The (++)s should right-associate given the definition of mapM_.

Ganesh Sittampalam
  • 28,821
  • 4
  • 79
  • 98
2

The most naive solution would be to simply do both evalutations independently:

catMaybesCount :: [Maybe a] -> ([a], Int)
catMaybesCount xs = (catMaybes xs, length $ filter isNothing xs)

I don't know if GHC is able to optimize this properly, but the length . filter p solution for counting Nothings has some peculiarities anyway (see this SO post for an overview).

Theoretically, this solution could require two passes over the list, instead of the one

This is a recursive solution solving this issue I came up with:

import Data.Maybe

-- | Equivalent to @catMaybes@, but additonally counts @Nothing@ values
catMaybesCount :: [Maybe a] -> ([a], Int)
catMaybesCount xs =  catMaybesCountWorker xs [] 0

-- | Worker function for @catMaybesCount@   
catMaybesCountWorker :: [Maybe a] -> [a] -> Int -> ([a], Int)
catMaybesCountWorker [] justs cnt = (justs, cnt)
catMaybesCountWorker (Nothing:xs) justs cnt =
    catMaybesCountWorker xs justs (cnt + 1)
catMaybesCountWorker ((Just v):xs) justs cnt =
    catMaybesCountWorker xs (justs ++ [v]) cnt

As applying it to a list should evaluate the list only once, this should be more efficient.

However I am worried about the justs ++ [v] anti-idiom, as (:) would be more efficient (see this discussion). However, this would invert the resulting list. Maybe someone with more knowledge on this topic could have a look at it?

Note that this function won't terminate for infinite lists because the Nothing count will never finish to evaluate.

Community
  • 1
  • 1
Uli Köhler
  • 13,012
  • 16
  • 70
  • 120
2

For cases like this, the foldl package by Gabriel Gonzalez is very handy. You can simply use the predefined folds or define custom ones like below and combine them using an applicative interface:

import qualified Control.Foldl as L
import Control.Applicative ((<$>),(<*>))
import Data.Monoid
import qualified Data.DList as DL

catMaybesCount :: [Maybe a] -> ([a], Int)
catMaybesCount = L.fold $ (,) <$> elemJust <*> countJust

-- L.Fold :: (x -> a -> x) -> x -> (x -> b) -> L.Fold a b

elemJust :: L.Fold (Maybe a) [a]
elemJust = L.Fold step DL.empty DL.toList
  where step xs (Just x) = DL.snoc xs x
        step xs Nothing = xs

countJust :: L.Fold (Maybe a) Int
countJust = L.Fold step (Sum 0) getSum
  where step acc (Just _) = acc
        step acc Nothing = acc <> Sum 1
Markus1189
  • 2,829
  • 1
  • 23
  • 32