Is getLine lazy?

Say I have a very long line of input that is just a sequence of numbers, and I only need to sum the first 3 numbers. Will getLine be efficient and read only the first part of the line, or do I have to write my own function for lazy line reading that reads characters one by one?

Will my implementation be efficient if I were to sum the whole line? (Will there be an overhead due to reading characters one by one?)

import Control.Applicative

main = do
    line <- getLine'
    print $ sum $ map read $ take 3 $ words line

getLine' :: IO String
getLine' = do
    c <- getChar
    if c == '\n' then return [] else (c:) <$> getLine'
Will Ness
Andrzej Gis
  • I think the library `getLine` and your `getLine'` are both equally strict. IO actions cannot return lazily except by leveraging some `unsafe` function -- this is referred to as "lazy IO" and must be handled with some care, since the actual reading will start later due to laziness, which might cause some issues. Lazy IO is (in)famously hard to debug. You could however use a strict custom `get3Ints` which only reads the part of the string you need. – chi Aug 20 '17 at 22:30
  • `getLine` must be strict in order to be correct, and as chi says your `getLine'` behaves exactly the same. If `getLine` were non-strict, then pure computations you do later would cause IO, by realizing more characters from the lazy input. This would be a nightmare when you consider that other IO can also be going on, also reading from stdin: which characters go where will be extremely hard to figure out. – amalloy Aug 20 '17 at 23:22
  • Cf. [this answer](https://codereview.stackexchange.com/a/120037/16551) to see a bit more about `IO` and laziness. If you want a lazy `getLine`, it'll probably need to have a type similar to `IO (ListT IO Char)` where `data ListT m a = Nil | Cons a (m (ListT m a))` (You could also have `ListT IO String` instead of `ListT IO Char` if you read the input in chunks of a given length to be more efficient). – gallais Aug 21 '17 at 14:19
  • what in the world is the down-vote for? – Will Ness Aug 22 '17 at 07:25
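
chi's comment above mentions a strict custom `get3Ints` that reads only the part of the line it needs. Here is a minimal sketch of that idea, assuming whitespace-separated integers on the line. The names `readToken`, `readTokens`, `stdinChar`, `stringSource` and `demo` are hypothetical helpers written for this illustration, not library functions. The reader takes the character source as a parameter, so the same logic can run against stdin or against an in-memory string:

```haskell
import Data.Char (isSpace)
import Data.IORef (newIORef, readIORef, writeIORef)
import System.IO (isEOF)

-- Read one whitespace-delimited token from the current line.
-- The Bool is True when the line (or the whole input) ended while reading.
readToken :: Monad m => m (Maybe Char) -> m (String, Bool)
readToken next = skipSpaces
  where
    skipSpaces = do
      mc <- next
      case mc of
        Nothing   -> return ("", True)
        Just '\n' -> return ("", True)
        Just c
          | isSpace c -> skipSpaces
          | otherwise -> collect [c]
    collect acc = do
      mc <- next
      case mc of
        Nothing   -> return (reverse acc, True)
        Just '\n' -> return (reverse acc, True)
        Just c
          | isSpace c -> return (reverse acc, False)
          | otherwise -> collect (c : acc)

-- Read at most n tokens from the current line, stopping as soon as the
-- n-th token is complete; the rest of the line is left unread.
readTokens :: Monad m => m (Maybe Char) -> Int -> m [String]
readTokens next n
  | n <= 0    = return []
  | otherwise = do
      (tok, ended) <- readToken next
      if null tok
        then return []                 -- the line ended before another token
        else if ended
          then return [tok]
          else (tok :) <$> readTokens next (n - 1)

-- Character source backed by stdin; Nothing signals end of input.
stdinChar :: IO (Maybe Char)
stdinChar = do
  eof <- isEOF
  if eof then return Nothing else Just <$> getChar

-- Strict reader: up to three integers from the current line of stdin.
-- Like the code in the question, `read` fails on non-numeric tokens.
get3Ints :: IO [Integer]
get3Ints = map read <$> readTokens stdinChar 3

-- In-memory source for experiments: returns the reader action and an
-- action reporting how many characters have been consumed so far.
stringSource :: String -> IO (IO (Maybe Char), IO Int)
stringSource s = do
  ref <- newIORef (s, 0 :: Int)
  let next = do
        (rest, k) <- readIORef ref
        case rest of
          []       -> return Nothing
          (c : cs) -> writeIORef ref (cs, k + 1) >> return (Just c)
      consumed = snd <$> readIORef ref
  return (next, consumed)

-- Three tokens from an endless "line": yields (["10","20","30"], 9).
demo :: IO ([String], Int)
demo = do
  (next, consumed) <- stringSource ("10 20 30 " ++ cycle "7 ")
  toks <- readTokens next 3
  used <- consumed
  return (toks, used)

main :: IO ()
main = get3Ints >>= print . sum
```

The action stays strict, as the comments require: everything it returns has already been read, and it simply stops asking for characters once it has three tokens. The runtime's handle buffering may still fetch more bytes from the operating system than the program consumes, so this limits the work done by the program rather than the bytes transferred.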

1 Answer

While getLine isn't lazy, getContents is, and it can be combined with functions like lines and words. Therefore, the following program will only read enough of stdin to get (up to) three integers from the first line and print their sum:

main :: IO ()
main = do contents <- getContents
          let lns = lines contents
              result = sum $ map read $ take 3 $ words $ head lns
          print (result :: Integer)

Note that, if you modify the program to access subsequent lines -- for example, if you added:

putStrLn $ take 80 $ lns !! 1

to the bottom of the program to print the first 80 characters of the second line, then the program would have to finish reading the first line before it could process the second (so it would hang for a bit between the last two lines of the program). In other words, this lazy line reading only helps if you need just the first bit of the first line: Haskell doesn't have any magic way to skip the rest of the first line to get to the second.

Finally, note that, for the above program, if there are fewer than three integers on the first line, it'll just sum those numbers and won't try to read past the first line (which I think is what you wanted). If you don't actually care about the line endings and just want to sum the first three numbers in the file, regardless of how they're divided up into lines, then you can break the contents up directly into words like so:

main = do contents <- getContents
          let result = sum $ map read $ take 3 $ words contents
          print (result :: Integer)
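
Since getContents hands back the input as an ordinary lazy String, the behaviour described in this answer can be checked on an in-memory string, with no input file needed. The following is a small sketch under that assumption; `sumFirstLine` and `sumFirstWords` are hypothetical names for the two pipelines above, and `firstLine` replaces `head` only so that empty input gives 0 instead of an error:

```haskell
-- Sum up to three integers taken from the first line only.
sumFirstLine :: String -> Integer
sumFirstLine = sum . map read . take 3 . words . firstLine
  where
    -- Like `head . lines`, but yields "" for empty input.
    firstLine = concat . take 1 . lines

-- Sum the first three integers regardless of how they are split into lines.
sumFirstWords :: String -> Integer
sumFirstWords = sum . map read . take 3 . words

main :: IO ()
main = do
  -- A first line that never ends: only a finite prefix is ever forced.
  let endless = "10 20 30 " ++ cycle "7 "
  print (sumFirstLine endless)         -- 60
  -- Fewer than three numbers on the first line: the second line is ignored.
  print (sumFirstLine "1 2\n3 4\n")    -- 3
  -- Splitting into words directly crosses the line break.
  print (sumFirstWords "1 2\n3 4\n")   -- 6
```

The first call terminates even though the string has no end, which is the same reason the getContents version stops reading stdin early. Replacing the string with the result of getContents gives back the programs shown above.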
K. A. Buhr