It sounds like you are worried about some of the down sides of lazy evaluation. You want to ensure your fold, loop, recursion is handled in constant memory.
The iteratee libraries were created solve this problem,
pipes,
conduit,
enumerator,
iteratee,
iterIO.
The most popular and also recent are
pipes and
conduit. Both of which go beyond the iteratee model.
The
pipes library focuses on being theoretically sound in an effort to eliminate bugs and to allow the constancy of design open up efficient yet high levels of abstraction(my words not the authors). It also offers bidirectional streams if desired which is a benefit so far unique to the library.
The
conduit is not quite as theoretically as well founded as pipes but has the large benefit of currently having more associated libraries built on it for parsing and handling http streams, xml streams and more. Check out the conduit section at hackage in on the packages page. It is used
yesod one of Haskell's larger and well known web frameworks.
I have enjoyed writing my streaming applications with pipes library in particular the ability to make proxy transformer stacks. When I have needed to fetch a web page or parse some xml I have been using the conduit libraries.
I should also mention
io-streams which just did its first official release. It's aim is at IO in particular, no surprise it is in its name, and utilizing simpler type machinery, fewer type parameters, then
pipes or
conduit. The major down side is that you are stuck in the IO monad so it is not very helpful to pure code.
{-# language NoMonoMorphismRestriction #-}
import Control.Proxy
Start with simple translation.
map (+1) [1..10]
becomes:
runProxy $ mapD (+1) <-< fromListS [1..10]
The iteratee like offerings a little more verbose for simple translations, but offer large wins with larger examples.
A example of a proxy, pipes library, that generates fibonacci numbers in constant sapce
fibsP = runIdentityK $ (\a -> do respond 1
respond 1
go 1 1)
where
go fm2 fm1 = do -- fm2, fm1 represents fib(n-2) and fib(n-1)
let fn = fm2 + fm1
respond fn -- sends fn downstream
go fm1 fn
These could streamed to the stdout with
runProxy $ fibsP >-> printD -- printD prints only the downstream values, Proxies are the bidirectional offer of the pipes package.
You should check out the proxy tutorial and the conduit tutorial which I just found out is now at FP Complete's school of Haskell.
One method to find the mean would be:
> ((_,l),s) <- (`runStateT` 0) $ (`runStateT` 0) $ runProxy $ foldlD' ( flip $ const (+1)) <-< raiseK (foldlD' (+)) <-< fromListS [1..10::Int]
> let m = (fromIntegral . getSum) s / (fromIntegral . getSum) l
5.5
Now it is easy to add map or filter the proxy.
> ((_,l),s) <- (`runStateT` 0) $ (`runStateT` 0) $ runProxy $ foldlD' ( flip $ const (+1)) <-< raiseK (foldlD' (+)) <-< filterD even <-< fromListS [1..10::Int]
edit: code rewritten to take advantage of the state monad.
update:
On more method of doing multiple calculation over a large stream of data in a compassable fashion then writing direct recursion is demonstrated in the blog post beautiful folding. Folds are turned into data and combined while using a strict accumulator. I have not used this method with any regularity, but it does seem to isolate where strictness is required making it easier to apply. You should also look at an answer to another question similar question that implements the same method with applicative and may be easier to read depending on your predilections.