I am having some weird performance issues with evalRandIO. Here's the offending code:

import Control.Monad.Random

inf :: (RandomGen g, Random a) => Rand g [a]
inf = sequence $ repeat $ getRandom

many :: (RandomGen g, Random a) => Int -> Rand g [a]
many n = sequence $ replicate n $ getRandom

main = do
  m <- evalRandIO $ many 1000000 :: IO [Bool]
  i <- evalRandIO $ inf :: IO [Bool]
  putStrLn $ show $ take 5 m 
  putStrLn $ show $ take 5 i

This code prints 5 random bools and then overflows the stack. But if I comment out either evalRandIO statement, like so:

main = do
  --m <- evalRandIO $ many 1000000 :: IO [Bool]
  i <- evalRandIO $ inf :: IO [Bool]
  --putStrLn $ show $ take 5 m 
  putStrLn $ show $ take 5 i

the code runs fine. What is happening?

My ghci output:

strontium:movie andrew$ ghci rtest.hs
GHCi, version 6.12.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
Ok, modules loaded: Main.
Prelude Main> main
Loading package syb-0.1.0.2 ... linking ... done.
Loading package base-3.0.3.2 ... linking ... done.
Loading package mtl-1.1.0.2 ... linking ... done.
Loading package old-locale-1.0.0.2 ... linking ... done.
Loading package time-1.1.4 ... linking ... done.
Loading package random-1.0.0.3 ... linking ... done.
Loading package MonadRandom-0.1.6 ... linking ... done.
[True,True,True,True,True]
[^CInterrupted.
Don Stewart
nitromaster101

2 Answers

3

The answer is simpler than it seems. When you write:

 m <- evalRandIO $ many 1000000 :: IO [Bool]

the next call to evalRandIO has to evaluate the value of the seed after 1000000 operations. That value has not been computed yet at that point, and evaluating it all at once takes a lot of stack space.
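
As a rough illustration of why a deferred seed can be expensive to evaluate, here is a sketch that uses only base. It assumes the global generator is stored lazily, which is an assumption about the library and not something shown above. The names step, runLazy and runStrict are made up, and step is a stand-in for one generator update, not the real StdGen algorithm.

```haskell
import Control.Monad (replicateM_)
import Data.IORef (modifyIORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for one generator step (not the real StdGen update).
step :: Int -> Int
step s = (s * 1103515245 + 12345) `mod` 2147483648

-- Lazy updates: the IORef ends up holding n nested, unevaluated calls to
-- step. Nothing is computed until the final value is demanded.
runLazy :: Int -> IO Int
runLazy n = do
  ref <- newIORef 42
  replicateM_ n (modifyIORef ref step)
  v <- readIORef ref
  return $! v -- forcing the value here walks the whole chain at once

-- Strict updates: each new state is evaluated as soon as it is written,
-- so no chain builds up.
runStrict :: Int -> IO Int
runStrict n = do
  ref <- newIORef 42
  replicateM_ n (modifyIORef' ref step)
  readIORef ref

main :: IO ()
main = do
  a <- runLazy 100000
  b <- runStrict 100000
  -- The two results agree; only the moment of evaluation differs.
  print (a == b)
```

Both versions compute the same number. The lazy one just postpones all the work until something reads the state, which is the position the second evalRandIO would be in.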

So when inf is run, it first has to evaluate the current seed value before it can proceed with its own calculations. If you change the seed before running inf, your program will finish instantly:

import Control.Monad.Random

inf :: (RandomGen g, Random a) => Rand g [a]
inf = sequence $ repeat $ getRandom

many :: (RandomGen g, Random a) => Int -> Rand g [a]
many n = sequence $ replicate n $ getRandom

main = do
  newGen' <- newStdGen
  m <- evalRandIO $ many 1000000 :: IO [Bool]
  setStdGen newGen'
  i <- evalRandIO $ inf :: IO [Bool]
  putStrLn $ show $ take 5 m 
  putStrLn $ show $ take 5 i

Note that newStdGen uses the current seed value, so you have to call it before the call to many.
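
Another way to keep the two computations from sharing state is to give each one its own generator and run it with evalRand instead of evalRandIO. This is only a sketch: the names gMany and gInf are mine, and it assumes the lazy Rand monad, so that inf can produce its first elements without running to the end.

```haskell
import Control.Monad.Random

inf :: (RandomGen g, Random a) => Rand g [a]
inf = sequence (repeat getRandom)

many :: (RandomGen g, Random a) => Int -> Rand g [a]
many n = sequence (replicate n getRandom)

main :: IO ()
main = do
  -- Take both generators up front; each newStdGen splits the global one.
  gMany <- newStdGen
  gInf <- newStdGen
  -- evalRand is pure, so neither computation writes back to the global
  -- generator.
  let m = evalRand (many 1000000) gMany :: [Bool]
      i = evalRand inf gInf :: [Bool]
  print (take 5 m)
  print (take 5 i)
```

Here the global generator is only touched by the two newStdGen calls, so nothing has to evaluate the state left behind by many.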

Tener
  • Yeah, I was thinking if it was the same problem as in http://stackoverflow.com/questions/3358913/is-mapm-in-haskell-strict-why-does-this-program-get-a-stack-overflow. – Volker Stolz Apr 19 '11 at 11:44
  • But that doesn't explain the stack overflow. Certainly repeatedly reseeding should use constant stack space. I'm wondering if there's a leak somewhere, but I couldn't make the profiling detailed enough. – Volker Stolz Apr 19 '11 at 14:27
0

Can't reproduce on MacOS X 10.6.4:

botanix:~ stolz$ ghci rand.hs 
GHCi, version 6.12.3: http://www.haskell.org/ghc/  :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
[1 of 1] Compiling Main             ( rand.hs, interpreted )
Ok, modules loaded: Main.
*Main> main
Loading package old-locale-1.0.0.2 ... linking ... done.
Loading package time-1.1.4 ... linking ... done.
Loading package random-1.0.0.2 ... linking ... done.
Loading package transformers-0.2.2.0 ... linking ... done.
Loading package mtl-2.0.1.0 ... linking ... done.
Loading package MonadRandom-0.1.6 ... linking ... done.
[True,False,False,True,True]
[False,True,False,True,False]

Update: D'oh, it behaves differently when compiled, so I'm eating my words:

botanix:~ stolz$ ./a.out 
[False,True,True,False,True]
Stack space overflow: current size 8388608 bytes.
Volker Stolz