14

I want to write a function that goes through a list updating an accumulator until that accumulator reaches a certain condition or I get to the end of the list. For example, a product function that stops as soon as its accumulator reaches zero.

I know how to code it by writing the recursion by hand:

{-# LANGUAGE BangPatterns #-}

prod :: [Integer] -> Integer
prod xs =
    go 1 xs
  where
    go 0   _       = 0
    go !acc []     = acc
    go !acc (x:xs) = go (acc * x) xs

but is there a way to code this using folds and other higher order functions?


One thing that comes to mind is defining

mult 0 _ = 0
mult x y = x * y

and then using foldl'. However, this doesn't break out early so its a bit wasteful of performance.

We can't use foldr since it goes through the list in the wrong order and its way of "breaking out early" is by looking at the elements of the list instead of looking at the accumulator (this would have mattered if the accumulator had a different type than the list elements).

hugomg
  • 68,213
  • 24
  • 160
  • 246
  • You could use `Monoids` and `mconcat` – viorior Jan 20 '14 at 16:08
  • @viorior: That doesn't work in the general case where the accumulator and list elements have different types. – hugomg Jan 20 '14 at 16:17
  • @missingno: Sure it does, just `fmap` into the result accumulator type before `mconcat`ing. – Tom Ellis Jan 20 '14 at 16:25
  • 2
    You _can_ use `foldr` for this example, because `foldr` is non-strict, as is your `mult` operator, so `foldr1 mult $ [1,2] ++ [0..]` gives you `0` very quickly. Perhaps you'd be better off with an example involving addition where you stop as soon as the accumulator's zero, so you can't tell from the data that it will be: you want to ask for `mult _ 0 = 0` not the other way round if you want it to short circuit on the accumulator, as `foldr :: (a -> b -> b) -> b -> [a] -> b` - the second argument is the accumulator. – not my job Jan 20 '14 at 17:43
  • I'm not sure I see why it's an issue that you'd break out of `foldr` by looking at the elements instead of the accumulator. In this problem, the only way the accumulator can be `0` is if one (or more) of the elements is `0`. – David Young Jan 20 '14 at 20:40
  • 1
    @DavidYoung: In my real problem the only way is to look at the accumulator because the function is not commutative and assoociative like `*` is. I guess I shouldnt have chosen "product" as an example. – hugomg Jan 20 '14 at 21:59
  • @DavidYoung For example, if the accumulator is your cash whilst gambling, and you add a positive or negative number depending on whether you win or lose. When that total is zero or less, it's game over and you stop there, but the data in the list doesn't show that: `-1` may bankrupt you and `-100` might not. 0 is rather unique with multiplication because `a*b=0` implies `a=0` or `b=0` ("there are no zero divisors"), so it's OK to test the individual elements instead of the accumulator. I initially answered with `foldr` but had to delete it when I realised I wasn't testing the accumulator. – not my job Jan 21 '14 at 08:54
  • 2
    BTW you don't need the bang pattern `!acc` because you pattern match on it against `0`. This will force it, on its own. – Will Ness Jan 21 '14 at 12:47

3 Answers3

19

One simple way is to do the computation in a monad that allows to escape early, such as Either or Maybe:

{-# LANGUAGE BangPatterns #-}
import Data.Functor ((<$))
import Data.Maybe (fromMaybe)
import Control.Monad

prod :: [Integer] -> Integer
prod = fromMaybe 0 . prodM

-- The type could be generalized to any MonadPlus Integer
prodM :: [Integer] -> Maybe Integer
prodM = foldM (\ !acc x -> (acc * x) <$ guard (acc /= 0)) 1

At each step of the computation we check if the accumulator is nonzero. And if it's zero, guard invokes mplus, which exits the computation immediately. For example the following exits immediately:

> prod ([1..5] ++ [0..])
0
Petr
  • 62,528
  • 13
  • 153
  • 317
5

It seems that scanl is the simplest list combinator that gives you what you want. For example this won't evaluate the 1 to 10 of the second list.

Prelude> let takeUntil _ [] = []; takeUntil p (x:xs) = if p x then [x] else (x: takeUntil p xs)
Prelude> (last . takeUntil (==0) . scanl (*) 1) ([1..10] ++ [0..10])
0

takeUntil doesn't seem to exist in the standard library. It's like takeWhile but also gives you the first failing element.

If you want to do this properly you should take care with that partial last. If you want a powerful general solution I guess mapAccumL is the way to go.

Tom Ellis
  • 9,224
  • 1
  • 29
  • 54
  • Exactly! `scanl` gives us the left-to-right flow, and `foldr` - an early breakout: `takeUntil p = foldr (\x r-> head([[x]|p x]++[r])) []`. -- `last` is not partial here, since `scanl` always produces a non-empty list. :) – Will Ness Jan 21 '14 at 12:36
  • and of course where `scanl` fits the bill, so will `iterate`, and `unfoldr`. `unfoldr` will even stop on its own. – Will Ness Jan 21 '14 at 12:55
  • `last . takeUntil p` could be rewritten as `head . takeWhile (not.p)` with the same effect. – not my job Jan 21 '14 at 13:18
  • @WillNess: Indeed the partiality of `last` isn't hit here, but one should ideally write this in a way that doesn't use `last` and where the totality is obvious by construction. – Tom Ellis Jan 21 '14 at 14:56
  • @chunksOf50: No, that will typically just take the first element. – Tom Ellis Jan 21 '14 at 14:58
  • @TomEllis one could argue that in `last ... take ... scanl` the totality is obvious by construction. :) but that's a matter of taste of course. :) (indeed it is not *immediate*...) – Will Ness Jan 21 '14 at 15:01
  • 1
    @chunksOf50 you meant `dropWhile (not.p) >>> head`... but that's not correct either (what if `p` never holds?). – Will Ness Jan 21 '14 at 15:04
  • @TomEllis BTW `mapAccumL` can't stop on its own. Perhaps you meant `unfoldr`. – Will Ness Jan 21 '14 at 15:11
  • @WillNess You're right I meant `head . dropWhile (not.p)`, and you're also right that I'm wrong - my function isn't total. (If there _is_ an element satisfying it, they're the same.) – not my job Jan 21 '14 at 17:08
1

This post is already a bit older but I'd like to mention a general way to break fold with the additional possibility to return not only a default value or the accumulator but also the index and element of the list where the breaking condition was satisfied:

import Control.Monad(foldM)

breakFold step initialValue list exitCondition exitFunction = 
  either id (exitFunction (length  list) (last list)) (foldM f initialValue (zip [0..] list))
    where f acc (index,x)
            | exitCondition index x acc = Left (exitFunction index x acc)
            | otherwise = Right (step index x acc)

It only requires to import foldM. Examples for the usage are:

mysum thresh list = breakFold (\i x acc -> x + acc) 0 list (\i x acc -> x + acc > thresh) (\i x acc -> acc)
myprod thresh list = breakFold (\i x acc -> x * acc) 1 list (\i x acc -> acc == thresh) (\i x acc -> (i,x,acc))

returning

*myFile> mysum 42 [1,1..]
42
*myFile> myprod 0 ([1..5]++[0,0..])
(6,0,0)
*myFile> myprod 0 (map (\n->1/n) [1..])
(178,5.58659217877095e-3,0.0)

In this way, one can use the index and the last evaluated list value as input for further functions.

exchange
  • 363
  • 4
  • 9