9

I have been trying to encode an algorithm in Haskell that requires using lots of mutable references, but it is (perhaps not surprisingly) very slow in comparison to purely lazy code. Consider a very simple example:

module Main where

import Data.IORef
import Control.Monad
import Control.Monad.Identity

list :: [Int]
list = [1..10^6]

main1 = mapM newIORef list >>= mapM readIORef >>= print
main2 = print $ map runIdentity $ map Identity list

-- pick the variant to benchmark; without a 'main' the Main module will not link
main :: IO ()
main = main1

Running GHC 7.8.2 on my machine, main1 takes 1.2s and uses 290MB of memory, while main2 takes only 0.4s and uses a mere 1MB. Is there any trick to prevent this growth, especially in space? I often need IORefs for non-primitive types, unlike Int here, and had assumed that an IORef would cost roughly one additional pointer, much like a regular thunk, but my intuition seems to be wrong.

I have already tried a specialized list type with an unpacked IORef, but with no significant difference.
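
For reference, the specialised type was roughly along these lines (an illustrative reconstruction with made-up names, not my exact code, reusing the imports above):

-- a cons list whose cells carry the IORef unpacked, so each cell holds the
-- underlying MutVar# directly instead of an extra IORef box
data RefList = RefNil | RefCons {-# UNPACK #-} !(IORef Int) RefList

refList :: [Int] -> IO RefList
refList []     = return RefNil
refList (x:xs) = do
  r  <- newIORef x
  rs <- refList xs
  return (RefCons r rs)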

hpacheco

3 Answers

15

The problem is your use of mapM, which always performs poorly on large lists both in time and space. The correct solution is to fuse away the intermediate lists by using mapM_ and (>=>):

import Data.IORef
import Control.Monad

list :: [Int]
list = [1..10^6]

main = mapM_ (newIORef >=> readIORef >=> print) list

This runs in constant space and gives excellent performance, running in 0.4 seconds on my machine.

Edit: In answer to your question, you can also do this with pipes to avoid having to manually fuse the loop:

import Data.IORef
import Pipes
import qualified Pipes.Prelude as Pipes

list :: [Int]
list = [1..10^6]

main = runEffect $
    each list >-> Pipes.mapM newIORef >-> Pipes.mapM readIORef >-> Pipes.print

This runs in constant space in about 0.7 seconds on my machine.

Gabriella Gonzalez
  • By the way, does the **pipes** library allow to specify this sort of behaviour (without fusing the two `mapM`s)? – hpacheco Jun 06 '14 at 20:31
  • @hpacheco Yes. I updated my answer with the equivalent `pipes` solution. I actually wrote the solution in `pipes` first and then used equational reasoning to transform it into the equivalent hand-written loop. – Gabriella Gonzalez Jun 06 '14 at 22:06
  • Hmm, but this still kind of maps over `each` element of the list, as if it were a single map. For instance, how would I print the resulting list instead of each element on its own? – hpacheco Jun 06 '14 at 22:49
  • @hpacheco If you want to print the resulting list, then use `Pipes.toListM` instead of `Pipes.print`. That will fold the pipeline's output into a list, but more efficiently than `mapM`. – Gabriella Gonzalez Jun 07 '14 at 01:58
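
A sketch of the `Pipes.toListM` variant mentioned in the last comment (same imports as the pipes example above):

main :: IO ()
main = do
  -- fold the pipeline's output into a list, then print the whole list at once
  xs <- Pipes.toListM (each list >-> Pipes.mapM newIORef >-> Pipes.mapM readIORef)
  print xs
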
14

This is very likely not about IORef, but about strictness. Actions in the IO monad are serial -- all previous actions must complete before the next one can be started. So

mapM newIORef list

generates a million IORefs before anything is read.
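
For intuition, mapM on lists behaves roughly like the sketch below (GHC's actual definition is written differently, but the sequencing is the same): the final return can only happen after every newIORef in the chain has run.

-- a rough model of mapM for lists (not GHC's actual code)
mapM' :: Monad m => (a -> m b) -> [a] -> m [b]
mapM' _ []     = return []
mapM' f (x:xs) = do
  y  <- f x           -- run the action for the head
  ys <- mapM' f xs    -- then *all* remaining actions
  return (y:ys)       -- only now is any result handed back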

However,

map runIdentity . map Identity
= map (runIdentity . Identity)
= map id

which streams very nicely, so we print one element of the list, then generate the next one, etc.

If you want a fairer comparison, use a strict map:

map' :: (a -> b) -> [a] -> [b]
map' f [] = []
map' f (x:xs) = (f x:) $! map' f xs
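
A usage sketch, reusing list and the Identity imports from the question; because map' forces the whole spine as it builds the result, this version should no longer stream, which is the fairer point of comparison:

main :: IO ()
main = print (map' runIdentity (map' Identity list))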
luqui
  • Thanks. Now that I look into it, I have probably simplified the problem too much, and it is not very surprising if GHC is actually simplifying the second `map`. But anyway, a function like `main = mapM (\r -> newIORef r >>= readIORef) list >>= print` still uses considerable time and space. If this is about strictness, then wouldn't a **lazy** `ST` monad improve things significantly? Something like `main = print $ runST (mapM newSTRef list >>= mapM readSTRef)`. For the record, it is even slower than using `IORef`s. – hpacheco Jun 05 '14 at 19:58
2

I have found that a hack towards a solution is to use a lazy `mapM` instead, defined as:

import System.IO.Unsafe (unsafeInterleaveIO)

lazyMapM :: (a -> IO b) -> [a] -> IO [b]
lazyMapM f [] = return []
lazyMapM f (x:xs) = do
  y  <- f x
  ys <- unsafeInterleaveIO $ lazyMapM f xs
  return (y:ys)
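
A usage sketch, assuming the list from the question:

main :: IO ()
main = lazyMapM (\x -> newIORef x >>= readIORef) list >>= print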

This allows the monadic version to run in the same 1MB and in similar time. I would expect that a lazy ST monad could solve this problem more elegantly, without resorting to unsafeInterleaveIO, with something like:

main = print $ runST (mapM newSTRef list >>= mapM readSTRef)

but that does not work (you also need to use unsafeInterleaveST), which leaves me wondering how lazy Control.Monad.ST.Lazy really is. Does anyone know? :)
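
For completeness, here is a sketch of what "also using unsafeInterleaveST" could look like, using the lazy ST modules from base and reusing list from the question (lazyMapST is a made-up name, and I have not measured this variant):

import Control.Monad.ST.Lazy (ST, runST)
import Control.Monad.ST.Lazy.Unsafe (unsafeInterleaveST)
import Data.STRef.Lazy (newSTRef, readSTRef)

lazyMapST :: (a -> ST s b) -> [a] -> ST s [b]
lazyMapST _ []     = return []
lazyMapST f (x:xs) = do
  y  <- f x
  ys <- unsafeInterleaveST (lazyMapST f xs)
  return (y:ys)

main :: IO ()
main = print $ runST (lazyMapST (\x -> newSTRef x >>= readSTRef) list)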

hpacheco
  • `ST.Lazy` is not as lazy as it seems it could be. All operations on the state are still sequential -- i.e. the moment you `newSTRef`, `readSTRef`, etc., the entire state history is forced. It's lazy in the sense that computations that don't depend on the state can operate lazily; e.g. `foo = fmap (1:) foo` will give an infinite list, not bottom. It's a bit disappointing, I know... – luqui Jun 06 '14 at 01:24
  • (Of course, it can't actually be any lazier than that; consider if one of the intermediate operations changed the variable you are reading? We have to force them all to make sure.) – luqui Jun 06 '14 at 01:26
  • Sorry, I have already created a new [question](http://stackoverflow.com/questions/24072934/haskell-how-lazy-is-the-lazy-control-monad-st-lazy-monad) on this but only noticed your reply now. Can this answer be posted there instead? – hpacheco Jun 06 '14 at 01:51
  • But you could at least only force the whole history at the time you read. I do not see a reason for `mapM (newSTRef) list >>= mapM (readSTRef)` not being more lazy than it is... Ignoring the history once you force the result is what does not seem possible. – hpacheco Jun 06 '14 at 02:28
  • `newSTRef` and `readSTRef` theoretically get along with laziness just fine. `writeSTRef` doesn't. We have to force all the `newSTRef`s the first time we `readSTRef`, because we don't know in advance that one of those wasn't a `writeSTRef` of a previously created `STRef` -- the semantics can't see that the first `mapM` only has `newSTRef`s in it. Does that answer your question? – luqui Jun 06 '14 at 02:33
  • It does. Summing it up: we could *safely* group/delay sequences of `newSTRef`s and `readSTRef`s, but as soon as `writeSTRef`s can be in play we have to replay the whole history. – hpacheco Jun 06 '14 at 03:03
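
For illustration, luqui's `fmap (1:)` remark above corresponds to something like the following sketch; per that comment it should yield a finite prefix rather than bottom:

import Control.Monad.ST.Lazy (ST, runST)

-- the result never inspects the state, so lazy ST can produce it lazily
foo :: ST s [Int]
foo = fmap (1:) foo

main :: IO ()
main = print (take 5 (runST foo))  -- per the comment above: [1,1,1,1,1]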