I found that the following Haskell code uses 100% CPU and takes about 14 seconds to finish on my Linux server.
{-# LANGUAGE OverloadedStrings #-}
module Main where
import qualified Data.ByteString.Lazy.Char8 as L
import System.IO
str = L.pack "FugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFuga\n"
main = do
    hSetBuffering stdout (BlockBuffering (Just 1000))
    sequence (take 1000000 (repeat (L.hPutStr stdout str >> hFlush stdout)))
    return ()
On the other hand, very similar Python code finishes the same task in about 3 seconds.
import sys
str = "FugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFuga\n"
def main():
    for i in xrange(0, 1000000):
        print str,
        sys.stdout.flush()
        # doIO()
main()
Using strace, I found that select is called every time hFlush is called in the Haskell version, whereas select is never called in the Python version. I guess this is one of the reasons the Haskell version is slow.
Is there any way to improve the performance of the Haskell version?
I already tried omitting hFlush, and that certainly decreased CPU usage a lot. But this solution is not satisfactory, because the output is then no longer flushed after each write.
Thanks.
EDITED
Thank you very much for your help! By changing sequence and repeat to replicateM_, the runtime dropped from 14s to 3.8s.
But now I have another question. I asked the question above because, when I removed hFlush from the program, it ran fast even though it still repeats the I/O using sequence and repeat.
Why is it only the combination of sequence and hFlush that makes it slow?
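For reference, the change described above might look like the following minimal sketch. It assumes the same string and buffering setup as the original program; the helper name writeFlushed is hypothetical and only introduced here for readability. The relevant difference in types is that sequence returns all results as a list (IO [()] in this case), while replicateM_ discards each result and returns IO ().

```haskell
module Main where

import Control.Monad (replicateM_)
import qualified Data.ByteString.Lazy.Char8 as L
import System.IO

-- Same payload as in the original program.
str :: L.ByteString
str = L.pack "FugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFuga\n"

-- Hypothetical helper: write one line and flush it immediately.
writeFlushed :: IO ()
writeFlushed = L.hPutStr stdout str >> hFlush stdout

main :: IO ()
main = do
    hSetBuffering stdout (BlockBuffering (Just 1000))
    -- replicateM_ runs the action n times and throws the results away,
    -- so no million-element list of () has to be built and returned.
    replicateM_ 1000000 writeFlushed
```

sequence_ (take 1000000 (repeat writeFlushed)) would be the closest drop-in replacement for the original expression, since it also discards the results.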
To confirm my new question, I changed my program as follows to do profiling.
{-# LANGUAGE OverloadedStrings #-}
module Main where
import qualified Data.ByteString.Char8 as S
import System.IO
import Control.Monad
str = S.pack "FugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFuga\n"
doIO = S.hPutStr stdout str >> hFlush stdout
doIO' = S.hPutStr stdout str >> hFlush stdout
doIOWithoutFlush = S.hPutStr stdout str
main = do
    hSetBuffering stdout (BlockBuffering (Just 1000))
    sequence (take 1000000 (repeat doIO))
    replicateM_ 1000000 doIO'
    sequence (take 1000000 (repeat doIOWithoutFlush))
    return ()
By compiling and running as follows:
$ ghc -O2 -prof -fprof-auto Fuga.hs
$ ./Fuga +RTS -p -RTS > /dev/null
I got the following result.
COST CENTRE       MODULE  %time  %alloc
doIO              Main     74.7    35.8
doIO'             Main     21.4    35.8
doIOWithoutFlush  Main      2.6    21.4
main              Main      1.3     6.9
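As a cross-check that does not depend on profiler instrumentation, each variant could also be timed separately. The following is only a sketch under stated assumptions: the helper timed is hypothetical, it measures CPU time via getCPUTime from base (not wall-clock time), and it reports on stderr so that the report does not mix with the benchmarked stdout output.

```haskell
module Main where

import Control.Monad (replicateM_)
import qualified Data.ByteString.Char8 as S
import System.CPUTime (getCPUTime)
import System.IO

str :: S.ByteString
str = S.pack "FugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFugaFuga\n"

doIO :: IO ()
doIO = S.hPutStr stdout str >> hFlush stdout

-- Hypothetical helper: run an action, discard its result, and print
-- the CPU time it consumed to stderr.
timed :: String -> IO a -> IO ()
timed label action = do
    start <- getCPUTime          -- CPU time in picoseconds
    _ <- action
    end <- getCPUTime
    let secs = fromIntegral (end - start) / 1.0e12 :: Double
    hPutStrLn stderr (label ++ ": " ++ show secs ++ " s CPU")

main :: IO ()
main = do
    hSetBuffering stdout (BlockBuffering (Just 1000))
    timed "sequence"    (sequence (take 1000000 (repeat doIO)))
    timed "replicateM_" (replicateM_ 1000000 doIO)
```

Built without -prof and run as ./Fuga > /dev/null, the two timings appear on the terminal while the benchmark output itself is discarded.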
What causes the difference between doIO and doIO', which do the same thing? And why does doIOWithoutFlush run fast even with sequence and repeat? Is there any reference on this behavior?
Thanks.