
I am currently learning Haskell, and there is one thing that baffles me:

When I build a complex expression (whose computation will take some time) that is constant, meaning it is built only from known, hard-coded values, the expression is not evaluated at compile time.

Coming from a C/C++ background, I am used to this kind of optimization.

What is the reason NOT to perform such optimization (by default) in Haskell/GHC? What are the advantages, if any?

data Tree a =
   EmptyTree
 | Node a (Tree a) (Tree a)
 deriving (Show, Read, Eq)

elementToTree :: a -> Tree a
elementToTree x = Node x EmptyTree EmptyTree

treeInsert :: (Ord a) => a -> Tree a -> Tree a
treeInsert x EmptyTree = elementToTree x
treeInsert x (Node a left right)
  | x == a    = Node x left right
  | x < a     = Node a (treeInsert x left) right
  | otherwise = Node a left (treeInsert x right)

treeFromList :: (Ord a) => [a] -> Tree a
treeFromList []     = EmptyTree
treeFromList (x:xs) = treeInsert x (treeFromList xs)

treeElem :: (Ord a) => a -> Tree a -> Bool
treeElem x EmptyTree = False
treeElem x (Node a left right)
  | x == a    = True
  | x < a     = treeElem x left
  | otherwise = treeElem x right

main :: IO ()
main = do
  let tree = treeFromList [0..90000]
  print (treeElem 3 tree)

As this will always print True, I would expect the compiled program to print the result and exit almost immediately.

Daniel Jour
  • Remember that Haskell has lazy evaluation. This applies whether or not the expression is "constant". – C. K. Young Oct 08 '13 at 21:52
  • Yes, I know. But what is the advantage of sticking to this (in this case unnecessary) laziness? – Daniel Jour Oct 08 '13 at 22:04
  • @ChrisJester-Young What if the compiler can guarantee the expression is evaluated due to strictness analysis? – Gabriella Gonzalez Oct 08 '13 at 22:07
  • @ChrisJester-Young Well, Haskell has "non-strict semantics". It's not allowed to fail to terminate on programs that lazy evaluation would allow to terminate, but it doesn't actually have to *use* lazy evaluation, and it doesn't even have to use the same evaluation strategy for all "constants". – Ben Mar 31 '14 at 01:53

3 Answers


You may like this reddit thread. The compiler could try to do this, but it would be dangerous: a constant of any type can do funny things, such as loop forever. There are at least two solutions. One is supercompilation, not yet available as part of any compiler, though you can try prototypes from various researchers. The more practical one is Template Haskell, which is GHC's mechanism for letting the programmer ask for some code to be run at compile time.
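As a minimal sketch of the Template Haskell route: the expression inside a `$(...)` splice is evaluated while the module compiles, and `lift` turns the result back into a literal in the generated code. (This example uses only Prelude functions inside the splice; the stage restriction means a splice can only call definitions imported from already-compiled modules, so the tree code from the question would have to live in its own module.)

```haskell
{-# LANGUAGE TemplateHaskell #-}
module Main where

import Language.Haskell.TH.Syntax (lift)

-- Evaluated at compile time; the resulting Int is embedded
-- in the binary as a plain literal.
answer :: Int
answer = $(lift (sum [0 .. 90000 :: Int]))

main :: IO ()
main = print answer  -- no summation happens at run time
```

Note that this only moves work the programmer explicitly asks to move; the compiler still never evaluates constants speculatively on its own.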

Daniel Wagner
  • Thank you very much for that link. Do you think that such compile-time evaluation will make it into a (non-prototype) compiler anytime soon? – Daniel Jour Oct 09 '13 at 06:35
  • @DanielOertwig I suspect that the GHC folks would be thrilled to include supercompilation patches -- if only somebody were to put in the effort required to do a good job on them! – Daniel Wagner Oct 09 '13 at 17:37
  • Unfortunately (for me) the compiler itself seems to be written in Haskell, which means hacking on its source won't be possible for me anytime soon :( – Daniel Jour Oct 09 '13 at 20:15

The process you are talking about is called supercompilation, and it is more difficult than you make it out to be; it is actually an active research topic in computer science! Some people are trying to create such a supercompiler for Haskell (probably based on GHC, my memory is vague), but the feature is not included in GHC (yet) because the maintainers want to keep compilation times down. You mention C++ as a language that does this – C++ also happens to have notoriously bad compilation times!

Your alternative in Haskell is to do this optimisation manually with Template Haskell, which is Haskell's compile-time macro system.

kqr
  • Of course this would (drastically) increase compilation times, but I don't see a problem there: as soon as the resulting binary is run more than once, you have already saved yourself some time. As a side note: compiling C++ takes time, but most of it is usually spent not on compiling the code itself, but on sorting out which code has to be compiled. Things like bloated headers and bad (= recursive) Makefiles are what increase compilation time so much, not (most) compile-time evaluation. – Daniel Jour Oct 09 '13 at 06:31
  • @DanielOertwig "As soon as the resulting binary is run more than one time, you already saved yourself some time" -- this operates under the somewhat questionable assumption that supercompilation takes no longer than evaluation. – Daniel Wagner Oct 10 '13 at 03:17
  • @DanielWagner Well, for that case the maintainers could make supercompilation an explicit option; anyone who thinks it isn't worth it could simply not use it. – Hi-Angel Jul 23 '15 at 10:16

In this case, GHC cannot be sure that the computation would finish. It's not a question of lazy versus strict, but rather of the halting problem. To you, it looks quite simple to say that `treeFromList [0..90000]` is a constant that can be evaluated at compile time, but how does the compiler know this? (The compiler can easily optimize `[0..90000]` to a constant, but you wouldn't even notice that change.)
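To make this concrete, here is a small sketch (the names `finite` and `unbounded` are illustrative) of two expressions that are both "constant" in the question's sense, built only from hard-coded values, yet only one of them can safely be forced at compile time:

```haskell
module Main where

-- Both lists are "constant", but fully evaluating the second
-- never terminates. A compiler that eagerly evaluated every
-- constant expression would hang on it.
finite, unbounded :: [Int]
finite    = [0 .. 90000]
unbounded = [0 ..]

main :: IO ()
main = do
  print (length finite)     -- safe to force completely
  print (take 3 unbounded)  -- laziness lets us demand only a prefix
```

Distinguishing the two cases automatically is exactly the halting problem, which is why GHC leaves the decision to the programmer.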

bheklilr
  • Ok, assuming a computation that never finishes, why not have the compiler hang instead of the runtime? Or in other words: is there any useful computation which doesn't halt (except IO-dependent things)? – Daniel Jour Oct 09 '13 at 06:19
  • @DanielOertwig I don't want my compiler to hang trying to run my program. And yes, there are useful computations that either don't halt, take a very long time to halt, or would simply take up a lot of space in the compiled binary. What if I had a tree, but I want more than just checking whether it has an element? Should the compiler compute a very large tree that could take up potentially gigabytes of hard drive space, just so it doesn't have to compute it at runtime? – bheklilr Oct 09 '13 at 12:24
  • @DanielOertwig This is particularly funny, because in Haskell you have non-terminating constant computations quite often: for example, whenever `repeat` is used, or `[1..]`. – Ingo Oct 09 '13 at 13:48
  • @bheklilr Consider this: if, with supercompilation, your compiler hangs, then there MUST be an input which will make the resulting program hang. Wouldn't you rather have the compiler hang than a possibly unnoticed bug in your release? And what's the problem with computing a large tree at compile time? If it's large, then it is large; the only thing one could save is building it up at runtime. Also remember that supercompilation like this could only be done on data available at compile time, so the amount of data is bounded. – Daniel Jour Oct 09 '13 at 20:00
  • @Ingo That's a very good point. A naive implementation that reduces an AST by collapsing nodes with only constant subnodes into a single constant would fail on constructs like `[1..]`. But adding an exception to that reduction scheme would solve this. – Daniel Jour Oct 09 '13 at 20:04
  • @DanielOertwig One downside is executable size: if you embed a large tree in your executable, the file size will increase. Another downside is that if it takes me an hour to compile a simple program, I have to wait an hour between each time I want to test it, even if I'm only testing a small part of the application. Besides, if I need that much data as soon as I start the program, it'll probably get shoved off into a file, which can usually be read faster than running a complex computation. – bheklilr Oct 09 '13 at 20:05
  • @bheklilr You are right about the executable size. I also agree that testing code compiled with supercompilation would take far too much time; that's why I would assume a switch to deactivate supercompilation. With supercompilation you would have the data in a file (the executable itself) instead of having to perform the very same complex computation on every program start. – Daniel Jour Oct 09 '13 at 20:19
  • @DanielOertwig But then we are back at the Halting Problem. You simply can't get away with just adding "an exception" for obvious cases like `[1..]`. Rather, for every constant expression, you must prove that it isn't infinite, i.e. that its computation halts. – Ingo Oct 09 '13 at 21:02