There seems to be some impediment to efficient lazy ByteString generation by recursion. To demonstrate this, the chosen task is to make a lazy random ByteString. (Random number generation is just a reasonably meaningful operation, i.e. a placeholder for any other recursion that might be of interest.)
Here are two attempts to create a random lazy ByteString of fixed length n. Both allocate huge amounts of heap. First some imports:
import qualified Data.ByteString.Lazy as BSL
import Data.Word8
import System.Random
Now the function that uses cons:
lazyRandomByteString1 :: Int -> StdGen -> BSL.ByteString
lazyRandomByteString1 n g = fst3 $ iter (BSL.empty, n, g) where
    fst3 (a, _, _) = a
    iter (bs', n', g') =
        if n' == 0 then (bs', 0, g')
        else iter (w `BSL.cons` bs', n'-1, g'') where
            (w, g'') = random g' :: (Word8, StdGen)
The same, just using unfoldr, is shorter, but almost as bad as the above:
lazyRandomByteString2 :: Int -> StdGen -> BSL.ByteString
lazyRandomByteString2 n g = BSL.unfoldr f (n, g) where
    -- unfoldr expects a step returning Maybe (next byte, next seed)
    f :: (Int, StdGen) -> Maybe (Word8, (Int, StdGen))
    f (n', g') =
        if n' == 0 then Nothing
        else Just (w, (n'-1, g'')) where
            (w, g'') = random g' :: (Word8, StdGen)
Within what's provided by Data.ByteString.Lazy, these are all the available options to create ByteStrings by recursion.
Next, turn to Data.ByteString.Lazy.Builder. It was built to build lazy ByteStrings, so surely this must be more efficient:
import Data.ByteString.Lazy.Builder (Builder, toLazyByteString, word8)
import Data.Monoid (mempty, (<>))  -- needed on compilers whose Prelude lacks these

lazyRandomByteString3 :: Int -> StdGen -> BSL.ByteString
lazyRandomByteString3 n g = toLazyByteString builder where
    builder :: Builder
    builder = fst3 $ iter (mempty, n, g) where
        fst3 (a, _, _) = a
        iter :: (Builder, Int, StdGen) -> (Builder, Int, StdGen)
        iter (b, n', g') =
            if n' == 0 then (b, 0, g')
            else iter (b <> (word8 w), n'-1, g'') where
                (w, g'') = random g' :: (Word8, StdGen)
But it isn't.
Builder really should be able to do this efficiently, shouldn't it? What is wrong with lazyRandomByteString3?
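For comparison, here is a minimal sketch of a variant that might be worth measuring against lazyRandomByteString3. It is not a confirmed fix, just a guess at the cause: all three attempts above thread an accumulator through a tail-recursive loop, so the whole result has to be assembled before any of it can be consumed. The sketch instead nests the recursion to the right, so each byte is emitted ahead of the Builder for the remainder. Assumptions: lcgStep is a hypothetical stand-in for random (a simple linear congruential step, so that the example only needs base and bytestring), lazyBytesSketch is an illustrative name, and the import uses Data.ByteString.Builder, which as far as I know is where recent bytestring releases expose the same Builder API.

```haskell
import qualified Data.ByteString.Lazy as BSL
import Data.ByteString.Builder (Builder, toLazyByteString, word8)
import Data.Monoid ((<>))  -- redundant on recent compilers, harmless otherwise
import Data.Word (Word8, Word32)

-- Hypothetical stand-in for `random`: one step of a linear congruential
-- generator, returning an output byte and the next state.
lcgStep :: Word32 -> (Word8, Word32)
lcgStep s = (fromIntegral (s' `div` 65536), s')
  where
    s' = s * 1664525 + 1013904223

-- Right-nested construction: the Builder for the current byte comes first
-- and the Builder for the remaining bytes is only built when it is run.
lazyBytesSketch :: Int -> Word32 -> BSL.ByteString
lazyBytesSketch n seed = toLazyByteString (go n seed)
  where
    go :: Int -> Word32 -> Builder
    go k s
      | k <= 0    = mempty
      | otherwise =
          let (w, s') = lcgStep s
          in word8 w <> go (k - 1) s'
```

To compare heap usage, each version can be compiled with ghc -O2 -rtsopts and run with +RTS -s while forcing the result, for example by printing BSL.length of it. Whether the difference, if any, carries over to the StdGen-based versions is something I have not verified.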
The source code is on GitHub.