The following program's memory usage does not explode when it is run as a compiled executable (built with ghc -O0 Explode.hs), but it does explode when run in ghci (with either ghci Explode.hs or ghci -fobject-code Explode.hs):

--Explode.hs
--Does not explode with : ghc -O0 Explode.hs
--Explodes with         : ghci Explode.hs
--Explodes with         : ghci -fobject-code Explode.hs
module Main (main) where
import Data.Int
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC

createStr :: Int64 -> String -> BL.ByteString
createStr num str = BL.take num $ BL.cycle $ BLC.pack str

main = do
  BLC.writeFile "results.txt" $ createStr 100000000 "abc\n"

Why does it explode in ghci but not with ghc -O0 Explode.hs, and how can I stop it from exploding in ghci? The methods I adopted in "Memory blowing up for strict sum/strict foldl in ghci" don't seem to work here. Thanks.

artella
  • You can leave `-O0`: [`-O0` actually gets ignored if not used with another `-O*` flag](https://www.haskell.org/ghc/docs/latest/html/users_guide/options-optimise.html). – Zeta Jul 27 '14 at 12:03

1 Answer

After inspecting the code of writeFile, it seems that it depends on the hPut function of Data.ByteString.Lazy:

-- | Outputs a 'ByteString' to the specified 'Handle'.
--
hPut :: Handle -> ByteString -> IO ()
hPut h cs = foldrChunks (\c rest -> S.hPut h c >> rest) (return ()) cs

hPut constructs the IO action that will print the lazy bytestring by applying a right fold of sorts over the chunks. The source for the foldrChunks function is:

-- | Consume the chunks of a lazy ByteString with a natural right fold.
foldrChunks :: (S.ByteString -> a -> a) -> a -> ByteString -> a
foldrChunks f z = go
  where go Empty        = z
        go (Chunk c cs) = f c (go cs)    

Looking at the code, it seems as if the "spine" of the lazy bytestring (but not the actual data in each chunk) will be forced before writing the first byte, because of how (>>) behaves for the IO monad.

In your example, the strict chunks composing your lazy bytestring are very small (each one is a single packed copy of "abc\n"). This means a very large number of them will be generated when foldrChunks "forces the spine" of the 100000000-character lazy bytestring.

If this analysis is correct, then reducing the number of strict chunks by making them bigger would reduce memory usage. This variant of createStr that creates bigger chunks doesn't blow up for me in ghci:

createStr :: Int64 -> String -> BL.ByteString
createStr num str = BL.take num $ BL.cycle $ BLC.pack $ concat $ replicate 1000 $ str
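To see how much the two definitions differ in chunk count, here is a minimal sketch that counts the strict chunks each one produces for a smaller length. The names createStrSmall, createStrBig and chunkCount are made up for this illustration, and the exact count for the bigger variant depends on how BLC.pack splits its input, so no specific numbers are claimed. It only shows that the variant yields fewer, larger chunks for the same bytes; it does not establish that chunk count is what makes ghci blow up.

```haskell
module Main (main) where

import Data.Int (Int64)
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC

-- Definition from the question (renamed for this sketch):
-- cycles the chunk(s) produced by packing the short string once.
createStrSmall :: Int64 -> String -> BL.ByteString
createStrSmall num str = BL.take num $ BL.cycle $ BLC.pack str

-- Variant from this answer (renamed for this sketch):
-- packs 1000 copies of the string before cycling.
createStrBig :: Int64 -> String -> BL.ByteString
createStrBig num str =
  BL.take num $ BL.cycle $ BLC.pack $ concat $ replicate 1000 str

-- Number of strict chunks that make up a lazy ByteString.
chunkCount :: BL.ByteString -> Int
chunkCount = length . BL.toChunks

main :: IO ()
main = do
  let n = 100000
  putStrLn $ "small-chunk version: " ++ show (chunkCount (createStrSmall n "abc\n"))
  putStrLn $ "big-chunk version:   " ++ show (chunkCount (createStrBig n "abc\n"))
```

Both functions produce the same bytes, so any difference in behaviour between them comes from the chunk layout alone.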

(I'm not sure why the compiled example doesn't blow up.)

danidiaz
  • I don't see why that should force the spine of the lazy bytestring. `(>>)` isn't strict in its second argument before executing the first. And `foldrChunks` would be rather useless if not lazy. – Ørjan Johansen Jul 27 '14 at 16:05
  • @danidiaz : Thanks for the detailed answer. The only thing I don't get is that if your argument holds, then it should also follow that `BLC.writeFile "results.txt" $ BL.cycle $ BLC.pack "abc\n"` should not write out anything to "results.txt", since an infinite spine will have to be forced. However one finds that if one runs such code and aborts it, then "results.txt" is actually populated with results. – artella Jul 27 '14 at 16:14
  • @artella After reading Ørjan's comment and your own, I fear my explanation is incorrect. You should unaccept it. – danidiaz Jul 27 '14 at 17:28
  • @Ørjan I think you are right about `(>>)`. My answer is wrong. – danidiaz Jul 27 '14 at 17:34
  • @ØrjanJohansen & danidiaz. After examining `writeFile` and `hPut`, I think the problem might be related to http://stackoverflow.com/questions/24986296/io-monadic-assign-operator-causing-ghci-to-explode-for-infinite-list. – artella Jul 27 '14 at 22:34
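Ørjan Johansen's point about laziness can be checked with a small sketch. It runs the same kind of right fold that hPut uses over an infinite lazy bytestring, but with a step action that counts chunks in an IORef and stops after three instead of writing to a handle. The name processThreeChunks and the counter are assumptions made for this illustration. If foldrChunks or (>>) forced the whole spine first, this would never return.

```haskell
module Main (main) where

import Data.IORef (newIORef, readIORef, writeIORef)
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC

-- Fold over an infinite lazy ByteString the way hPut does, but stop
-- after three chunks by not running the rest of the fold.
processThreeChunks :: IO Int
processThreeChunks = do
  ref <- newIORef (0 :: Int)
  let step _chunk rest = do
        n <- readIORef ref
        if n >= 3
          then return ()                    -- ignore the remaining chunks
          else writeIORef ref (n + 1) >> rest
  BL.foldrChunks step (return ()) (BL.cycle (BLC.pack "abc\n"))
  readIORef ref

main :: IO ()
main = processThreeChunks >>= print
```

This terminates, which is consistent with the observation above that writing an infinite bytestring still populates results.txt. It does not explain the memory growth in ghci.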