
As mentioned in "Why does (sum $ takeWhile (<10000000) [1..]) use so much memory?", the following does not blow up memory in ghci:

foldl' (+) 0 $ takeWhile (< 10000000) [1 .. ]

However, if I create a file containing:

import Data.List

longList::[Int]
longList = [1 .. ]

result :: Int
result = foldl' (+) 0 $ takeWhile (< 10000000) longList

main = do
  print $ result

and load it into ghci, then memory consumption blows up when I run the program. Why is this, and what can I do to fix the program? I am using ghc 7.8.3.

[EDIT]

It does not seem to blow up provided I compile first via ghc Test.hs. However, if I remove all the .hi and .o files and load the file via ghci Test.hs, memory does blow up.

artella
  • I'm not experiencing memory blowup for this program either in ghci or when compiled and run. I'm using `7.6.3`. Is it a compiler bug? Also, you may want to start your `ghci` using `ghci -fobject-code` as stated [here](http://www.haskell.org/haskellwiki/Memory_leak#A_note_on_GHCi). – Sibi Jul 19 '14 at 09:59
  • @Sibi I can reproduce this in `7.6.3`. If you remove all the `.hi` and `.o` files, and then load into ghci via `ghci Test.hs` does the program blow up for you? Thanks – artella Jul 19 '14 at 10:09
  • @Sibi, strangely if I compile first via `ghc Test.hs` and then load into ghci it does not blow up – artella Jul 19 '14 at 10:11
  • I can see some memory blowup if I remove them and test it again using `ghci`. I guess it's best not to test these conditions in `ghci`. – Sibi Jul 19 '14 at 10:26
  • I rolled back to the previous version. We don't put answers in questions on Stack Overflow, because unlike a lot of forums, you don't have to wade through a sea of "me too" and "has anyone solved this" and suchlike to find the answer that worked. The answer that worked best for the asker is pulled to the top by the green tick. – AndrewC Jul 19 '14 at 14:51

2 Answers


I believe this is due to the different treatment of the identifier longList when you :l the file in GHCi, as opposed to when it is compiled.

When you :l ModuleName in GHCi, by default all top-level identifiers in the module come into scope, so that you may debug it efficiently. For your example, this includes longList. This means that GHCi keeps the contents of longList around after it has been evaluated, which causes the memory leak. I suspect this is the case even with -fobject-code, so I am not sure the behavior discussed in the other comments actually is a bug.

When you compile the module, on the other hand, GHC uses the module export list to find out which identifiers are exposed in the result. Since you have no explicit module declaration, the header defaults to

module Main (main) where

This means that when compiling, GHC can note that all identifiers except main are not exposed, and that longList is only used once. It can then drop keeping its value around, avoiding the memory leak.
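As a minimal sketch of what the explicit header looks like in the question's file (assuming it is saved as Test.hs), the module below exports only main, so longList and result are not part of the module's interface. Per the comments on this answer, on GHC 7.8.3 this only stopped the blowup in ghci when combined with `ghci -fobject-code Test`, so treat it as an illustration rather than a guaranteed fix.

```haskell
-- Explicit header: only main is exported, so longList and result
-- are internal to the module.
module Main (main) where

import Data.List (foldl')

-- Top-level infinite list, as in the question.
longList :: [Int]
longList = [1 ..]

-- Strict left fold over the prefix below ten million.
result :: Int
result = foldl' (+) 0 $ takeWhile (< 10000000) longList

main :: IO ()
main = print result
```

Assuming a 64-bit Int, this prints 49999995000000, the sum of 1 through 9999999.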

Ørjan Johansen
  • Johansen : Fantastic thanks. Adding `module Main (main) where` and then loading into ghci solved all the problems!:) – artella Jul 19 '14 at 13:32
  • @artella Glad it helped, although now *I* am confused because I just said that should be the default behavior! – Ørjan Johansen Jul 19 '14 at 14:18
  • Now I am finding that I have to use both `module Main (main) where` and `ghci -fobject-code Test` (i.e. a mix of your answer and Sibi's answer). I think that when I wrote the comment above I had already made a bashrc alias using `-fobject-code` and didn't realise that it was needed. – artella Jul 19 '14 at 15:29
  • nomeata confirmed this is expected behaviour in the ticket https://ghc.haskell.org/trac/ghc/ticket/9332 – artella Jul 19 '14 at 19:02
  • @artella OK that makes more sense, although I still don't think `module Main (main) where` should have made a difference. I commented on the trac. – Ørjan Johansen Jul 19 '14 at 23:05
  • @ØrjanJohansen Even if it keeps the reference to `longList`, I don't think it should make any difference. `result` is supposed to operate in constant memory even if there is a reference to `longList`, isn't it? – Sibi Jul 20 '14 at 06:50
  • Orjan & Sibi : @Sibi If I split the code up into two separate files (see code in https://ghc.haskell.org/trac/ghc/ticket/9332#comment:10) then it seems that no combination of fiddling can stop it from blowing up the memory. – artella Jul 20 '14 at 09:07
  • @artella Does it blow up when it is compiled and run ? – Sibi Jul 20 '14 at 09:14
  • @Sibi : No, it only blows up when it is run through ghci via `ghci -fobject-code Main.hs` – artella Jul 20 '14 at 09:27
  • @artella What if you add `-O2` as well? – Ørjan Johansen Jul 20 '14 at 11:23
  • @ØrjanJohansen The code in https://ghc.haskell.org/trac/ghc/ticket/9332#comment:10 is fine with `-O2`, and even fine with `-O0`. It is only when it is run in ghci that it blows. – artella Jul 20 '14 at 11:27
  • @Sibi The "normal" evaluation of `result` requires it to evaluate the first 10000000 elements of `longList` as well, after which the thunk of `longList` will have to be updated to contain those 10000000 elements too. Only if `longList` can be garbage collected immediately because nothing else references it, or even better inlined into `result` and fused away, will `result` have a chance of not leaking memory into `longList`. (Technically I guess GHC *could* deduce that `longList` is so cheap to calculate that it doesn't need to keep it around, but I don't think it does.) – Ørjan Johansen Jul 20 '14 at 11:33
  • @artella I meant `-O2` as a flag to ghci, I'd expect it would do something when you also have `-fobject-code`. Although I guess if `-O0` doesn't show the problem... – Ørjan Johansen Jul 20 '14 at 11:35
  • @ØrjanJohansen Thanks for explaining. It makes much sense now. – Sibi Jul 20 '14 at 11:46
  • @ØrjanJohansen You mean `ghci -fobject-code -O2 Main.hs` ? I tried this and it still blows up. Thanks – artella Jul 20 '14 at 11:59
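
The point made in the comments above, that the `longList` thunk is updated to hold every element that `result` forces, can be sketched by building the list inside the fold instead of naming it at the top level. This mirrors the one-liner from the question, which reportedly did not blow up in ghci. The helper name `sumBelow` is hypothetical, and the behaviour is an assumption for interpreted or unoptimized code: with optimization enabled, GHC's full-laziness transformation may float `[1 ..]` back out to the top level.

```haskell
module Main (main) where

import Data.List (foldl')

-- Hypothetical helper, not from the original post. The list is created
-- inside the call, so no top-level name holds on to its head and the
-- consumed cells can be garbage collected as the fold advances.
-- Assumption: unoptimized or interpreted code; with -O the list
-- expression may be floated out and shared again.
sumBelow :: Int -> Int
sumBelow limit = foldl' (+) 0 $ takeWhile (< limit) [1 ..]

main :: IO ()
main = print (sumBelow 10000000)
```

Assuming a 64-bit Int, this prints 49999995000000, the same value as the original `result`.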

See the note on GHCi on the Haskell wiki's Memory leak page:

If you are noticing a space leak while running your code within GHCi, please note that interpreted code behaves differently from compiled code: even when using seq.

Consider starting ghci as follows:

$ ghci -fobject-code

Sibi
  • This does not work for me. If I remove all `.hi` and `.o` files and then load via `ghci -fobject-code Test.hs` the memory still blows up on linux. – artella Jul 19 '14 at 10:20
  • @artella Yes, with `-fobject-code` I can also see some memory blowup. The takeaway here is that when using `seq`, it is best to test by compiling to an executable and then running it. – Sibi Jul 19 '14 at 10:24
  • It is very strange, because using `-fobject-code` creates the `.hi` and `.o` files just like compiling via `ghc Test.hs`. However, with the former the memory blows up, but if I do the latter and then load into ghci the memory does not blow up. I can only guess that the files are generated in different ways. – artella Jul 19 '14 at 10:25
  • @artella But still, this should not happen when using `-fobject-code`. I'm not sure what is causing the problem. Probably file a bug report, since you can reproduce this on `7.8.3` as well? – Sibi Jul 19 '14 at 10:28
  • Yeah will file a bug. I can reproduce this in `7.6.3`, `7.8.2` and `7.8.3`. Thanks – artella Jul 19 '14 at 10:30
  • In the end I found I had to use a combination of your answer and Orjan's answer above in order to stop it blowing up in ghc 7.8.3. Thanks. – artella Jul 19 '14 at 15:36
  • There is an explanation from nomeata at https://ghc.haskell.org/trac/ghc/ticket/9332 in regards to the above behaviour. – artella Jul 19 '14 at 19:06