The fundamental problem is that getContents is an instance of lazy IO. This means that getContents produces a thunk that can be evaluated like a normal Haskell value, and only does the relevant IO when it's forced. contents is a lazy list that putStr tries to print, which forces the list and causes getContents to read as much as it can. putStr then prints everything that has been forced, and continues trying to force the rest of the list until it hits []. As getContents reads more and more of the stream (the exact behavior depends on buffering), putStr can print more and more of it immediately, giving you the behavior you see.
While this behavior is useful for very simple scripts, it ties Haskell's evaluation order to observable effects, something it was never meant to do. This means that controlling exactly when parts of contents get printed is awkward, because you have to break the normal Haskell abstraction and understand exactly how things are getting evaluated.
This leads to some potentially unintuitive behavior. For example, if you try to get the length of the input—and actually use it—the list is forced before you get to printing it, giving you the behavior you want:
main = do
    contents <- getContents
    let n = length contents
    print n
    putStr contents
but if you move the print n after the putStr, you go back to the original behavior because n does not get forced until after printing the input (even though n was still defined before putStr was used):
main = do
    contents <- getContents
    let n = length contents
    putStr contents
    print n
Normally, this sort of thing is not a problem because it won't change the behavior of your code (although it can affect performance). Lazy IO just brings it into the realm of correctness by piercing the abstraction layer.
This also gives us a hint on how we can fix your issue: we need some way of forcing contents before printing it. As we saw, we can do this with length, because length needs to traverse the whole list before computing its result. Instead of printing the length, we can use seq, which forces its left-hand argument to be evaluated (to weak head normal form) whenever its right-hand argument is evaluated, and then returns the right-hand one, throwing away the left-hand value:
main = do
    contents <- getContents
    let n = length contents
    n `seq` putStr contents
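To make the roles of seq and length concrete, here is a minimal sketch that is not part of the program above. The names partial, headOnly and spineLength are made up for illustration, and the error call simply stands in for "the rest of the input". It suggests that seq on a list stops at the first constructor, while length has to walk every cell, which is why forcing length contents is enough to make getContents read everything.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- A list whose first cell is fine but whose tail blows up when touched.
-- (Illustrative value only; it stands in for input that has not been read.)
partial :: [Int]
partial = 1 : error "tail forced"

-- seq evaluates its left argument only as far as the outermost
-- constructor (the first cons cell), so the tail is never touched.
headOnly :: Bool
headOnly = partial `seq` True

-- length must walk every cons cell, so it does reach the broken tail.
spineLength :: Int
spineLength = length partial

main :: IO ()
main = do
  print headOnly
  r <- try (evaluate spineLength) :: IO (Either ErrorCall Int)
  putStrLn (either (const "length forced the tail") show r)
```

In the real program n is an Int, so evaluating it with seq means running length to completion, and that in turn pulls the whole input through getContents.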
Still, the length trick is a bit ugly because we're using length just to traverse the list, not because we actually care about its result. What we would really like is a function that just traverses the list enough to evaluate it, without doing anything else. Happily, this is exactly what deepseq (from Control.DeepSeq in the deepseq package) does, for many data structures and not just lists:
import Control.DeepSeq
import System.IO

main = do
    contents <- getContents
    contents `deepseq` putStr contents
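If you would rather make the forcing an explicit step in the do block, one possible variation is to combine evaluate from Control.Exception with force from Control.DeepSeq. This is only a sketch: forceNow is a hypothetical helper name introduced here, and it assumes the deepseq package is available, as in the example above.

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)

-- Fully evaluate a lazily read String as an explicit IO step.
-- (forceNow is a hypothetical helper name used only in this sketch.)
forceNow :: String -> IO String
forceNow = evaluate . force

main :: IO ()
main = do
  contents <- getContents >>= forceNow  -- the whole input is read here
  putStr contents                       -- printing starts only afterwards
```

The advantage of this shape is that the point where the input gets read is pinned to a specific line of the do block rather than depending on when some other expression happens to be evaluated.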