
I'm doing a simple benchmark of ByteString versus String. The code loads a file of 10,000,000 lines, each containing an integer, and then converts each of those strings into an Int. It turns out that Prelude.read is much slower than ByteString.readInt.

I am wondering what the reason for this inefficiency is. I am also not sure which part of the profiling report corresponds to the time spent loading the file (the data file is about 75 MB).

Here is the code for the test:

import System.Environment
import System.IO
import qualified Data.ByteString.Lazy.Char8 as LC

main :: IO ()
main = do
  xs <- getArgs
  let file = xs !! 0

  inputIo <- readFile file
  let iIo = map readInt  . linesStr $ inputIo
  let sIo = sum iIo

  inputIoBs <- LC.readFile file
  let iIoBs = map readIntBs  . linesBs $ inputIoBs
  let sIoBs = sum iIoBs

  print [sIo, sIoBs]

linesStr = lines

linesBs  = LC.lines


readInt :: String -> Int
readInt x = read x :: Int

readIntBs :: LC.ByteString -> Int
readIntBs bs = case LC.readInt bs of
                Nothing -> error "Not an integer"
                Just (x, _) -> x

The code is compiled and executed as:

> ghc -o strO2 -O2  --make Str.hs -prof -auto-all -caf-all -rtsopts
> ./strO2  a.dat +RTS -K500M -p  

Note "a.dat" is at aforementioned format and about 75MB. The profiling result is:

       strO2 +RTS -K500M -p -RTS a.dat

    total time  =      116.41 secs   (116411 ticks @ 1000 us, 1 processor)
    total alloc = 117,350,372,624 bytes  (excludes profiling overheads)

COST CENTRE MODULE  %time %alloc

readInt     Main     86.9   74.6
main.iIo    Main      8.7    9.5
main        Main      2.9   13.5
main.iIoBs  Main      0.6    1.9


                                                        individual     inherited
COST CENTRE   MODULE                  no.     entries  %time %alloc   %time %alloc

MAIN          MAIN                     54           0    0.0    0.0   100.0  100.0
 main         Main                    109           0    2.9   13.5   100.0  100.0
  main.iIoBs  Main                    116           1    0.6    1.9     1.3    2.4
   readIntBs  Main                    118    10000000    0.7    0.5     0.7    0.5
  main.sIoBs  Main                    115           1    0.0    0.0     0.0    0.0
  main.sIo    Main                    113           1    0.2    0.0     0.2    0.0
  main.iIo    Main                    111           1    8.7    9.5    95.6   84.1
   readInt    Main                    114    10000000   86.9   74.6    86.9   74.6
  main.file   Main                    110           1    0.0    0.0     0.0    0.0
 CAF:main1    Main                    106           0    0.0    0.0     0.0    0.0
  main        Main                    108           1    0.0    0.0     0.0    0.0
 CAF:linesBs  Main                    105           0    0.0    0.0     0.0    0.0
  linesBs     Main                    117           1    0.0    0.0     0.0    0.0
 CAF:linesStr Main                    104           0    0.0    0.0     0.0    0.0
  linesStr    Main                    112           1    0.0    0.0     0.0    0.0
 CAF          GHC.Conc.Signal         100           0    0.0    0.0     0.0    0.0
 CAF          GHC.IO.Encoding          93           0    0.0    0.0     0.0    0.0
 CAF          GHC.IO.Encoding.Iconv    91           0    0.0    0.0     0.0    0.0
 CAF          GHC.IO.FD                86           0    0.0    0.0     0.0    0.0
 CAF          GHC.IO.Handle.FD         84           0    0.0    0.0     0.0    0.0
 CAF          Text.Read.Lex            70           0    0.0    0.0     0.0    0.0
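
I do not see an obvious cost centre for the file loading in this report. Here is a sketch of how I would try to give the loading its own cost centre (untested, and the SCC placement is just my guess): since readFile is lazy, force the contents under an explicit {-# SCC #-} annotation before any parsing, so the I/O work is not charged to the consumers:

import Control.Exception (evaluate)
import System.Environment (getArgs)

main :: IO ()
main = do
  [file] <- getArgs
  s <- readFile file
  -- readFile is lazy, so the actual reading happens only when the String
  -- is demanded; forcing its length here should charge that work to the
  -- "fileLoad" cost centre rather than to the downstream parsing.
  _ <- {-# SCC "fileLoad" #-} evaluate (length s)
  print (sum (map (read :: String -> Int) (lines s)))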

Edit:

The input file "a.dat" consists of 10,000,000 lines of numbers:

1
2
3
...
10000000

Following the discussion, I replaced "a.dat" with 10,000,000 lines of 1s, which does not change the performance observation above:

1
1
...
1
Causality
  • My guess would be because bytestrings are far more efficient than strings, so the act of accessing them from memory is much faster. Strings are just lists of Chars, whose elements aren't necessarily in adjacent memory locations, while bytestrings are more like C char arrays, so the strings can be loaded from memory in efficient chunks. – bheklilr Oct 28 '13 at 02:43
  • I understand the different memory representations of `String` and `ByteString`, but both mapping functions (`map readInt` and `map readIntBs`) take a list (either `[String]` or `[ByteString]`) as input, and each element of the list is very short (a string like "10,000,000" at most). But you could be right, I will test it. – Causality Oct 28 '13 at 02:53
  • Following the discussion, I did a test where all the lines are "1", so that it is either a length 10,000,000 `List` of "1" or that same length `List` of "1" in `ByteString` format. The performance observation still stands. – Causality Oct 28 '13 at 03:11

1 Answer


read is doing a much harder job than readInt. For example, compare:

> map read ["(100)", " 100", "- 100"] :: [Int]
[100,100,-100]
> map readInt ["(100)", " 100", "- 100"]
[Nothing,Nothing,Nothing]

read is essentially parsing Haskell. Combined with the fact that it's consuming linked lists, it's no surprise at all that it's really very slow indeed.
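
For a rough sense of how much less a special-purpose parser has to do, here is a bare-bones decimal parser over String (a sketch; it handles only non-negative digit strings):

import Data.Char (ord)
import Data.List (foldl')

-- A single strict pass over the characters, with none of the lexing that
-- 'read' performs (whitespace, parentheses, signs, hex/octal literals, ...).
readIntStr :: String -> Int
readIntStr = foldl' (\acc c -> acc * 10 + (ord c - ord '0')) 0

main :: IO ()
main = print (map readIntStr ["1", "23", "10000000"])

Even this still walks a linked list of boxed Chars, so it will not match ByteString.readInt, but it avoids read's per-character lexing overhead.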

Daniel Wagner
  • Right on. Is there another Prelude function that does this simple conversion from [Char] to Int/num, the way readInt does for ByteString? – Causality Oct 28 '13 at 04:12
  • BTW, could you point out which part of the profiling output corresponds to the cost of file I/O? – Causality Oct 28 '13 at 04:12
  • @Causality You might like the [`Numeric`](http://hackage.haskell.org/package/base-4.6.0.1/docs/Numeric.html) module. – Daniel Wagner Oct 28 '13 at 04:29
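
A sketch of what the Numeric suggestion above might look like (readDec parses an unsigned decimal and returns the value together with the unconsumed rest of the string; note that, unlike ByteString.readInt, it does not handle a leading minus sign):

import Numeric (readDec)

-- readDec returns a list of (value, remaining input) pairs.
readIntDec :: String -> Int
readIntDec s = case readDec s of
                 [(n, _)] -> n
                 _        -> error "Not an integer"

main :: IO ()
main = print (map readIntDec ["1", "42", "10000000"])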