variant of pascal's triangle in haskell - problem with lazy evaluation

Question

To solve some problem I need to compute a variant of the pascal's triangle which is defined like this:

f(1,1) = 1, 
f(n,k) = f(n-1,k-1) + f(n-1,k) + 1 for 1 <= k < n, 
f(n,0) = 0,
f(n,n) = 2*f(n-1,n-1) + 1.

For n given I want to efficiently get the n-th line (f(n,1) .. f(n,n)). One further restriction: f(n,k) should be -1 if it would be >= 2^32.

My implementation:

next :: [Int64] -> [Int64]
next list@(x:_) = x+1 : takeWhile (/= -1) (nextRec list)

nextRec (a:rest@(b:_)) = boundAdd a b : nextRec rest
nextRec [a] = [boundAdd a a]

boundAdd x y
    | x < 0 || y < 0 = -1
    | x + y + 1 >= limit = -1
    | otherwise = (x+y+1)

-- start shoud be [1]
fLine d start = until ((== d) . head) next start

The problem: for very large numbers I get a stack overflow. Is there a way to force haskell to evaluate the whole list? It's clear that each line can't contain more elements than an upper bound, because they eventually become -1 and don't get stored and each line only depends on the previous one. Due to the lazy evaluation only the head of each line is computed until the last line needs it's second element and all the trunks along the way are stored... I have a very efficient implementation in c++ but I am really wondering if there is a way to get it done in haskell, too.

It would help to provide a complete runnable program. This is close, but it's not clear what's missing or what you mean by "very large numbers". — Jason Orendorff, Mar 08 '10 at 16:21
I need numbers close to 2^32. In a c++ implementation the algorithm runs in linear time and constant space, because one line has at most 33 elements (the rest is -1). — haselhorstk, Mar 09 '10 at 07:45
`f(n,n) = 2*f(n-1,k-1) + 1`. I assume there shouldn't be k in the right hand side. — sastanin, Mar 10 '10 at 15:27
Well, I finally got a Haskell version that's fairly quick: http://pastie.org/864145 . Then I translated it into C: http://pastie.org/864140 and the C is roughly 10x faster (8msec vs. 100msec on my machine to do `time triangle 800000 | grep user`.) — Jason Orendorff, Mar 10 '10 at 23:10
thx a lot! Would you mind explaining your haskell code a bit - the whole ST Array etc. stuff is new to me and perhaps other users are interested, too. Perhaps you could give a second answer to the question. What is the difference in performance between simply using strict lists and your monad + array solutions? — haselhorstk, Mar 14 '10 at 10:16

score 5 · Accepted Answer · answered Mar 08 '10 at 17:26

Works for me: What Haskell implementation are you using? A naive program to calculate this triangle works fine for me in GHC 6.10.4. I can print the 1000th row just fine:

nextRow :: [Integer] -> [Integer]
nextRow row = 0 : [a + b + 1 | (a, b) <- zip row (tail row ++ [last row])]

tri = iterate nextRow [0]

main = putStrLn $ show $ tri !! 1000               -- print 1000th row

I can even print the first 10 numbers in row 100000 without overflowing the stack. I'm not sure what's going wrong for you. The global name tri might be keeping the whole triangle of results alive, but even if it is, that seems relatively harmless.

How to force order of evaluation: You can force thunks to be evaluated in a certain order using the Prelude function seq (which is a magic function that can't be implemented in terms of Haskell's other basic features). If you tell Haskell to print a `seq` b, it first evaluates the thunk for a, then evaluates and prints b.

Note that seq is shallow: it only does enough evaluation to force a to no longer be a thunk. If a is of a tuple type, the result might still be a tuple of thunks. If it's a list, the result might be a cons cell having thunks for both the head and the tail.

It seems like you shouldn't need to do this for such a simple problem; a few thousand thunks shouldn't be too much for any reasonable implementation. But it would go like this:

-- Evaluate a whole list of thunks before calculating `result`.
-- This returns `result`.
seqList :: [b] -> a -> a
seqList lst result = foldr seq result lst

-- Exactly the same as `nextRow`, but compute every element of `row`
-- before calculating any element of the next row.
nextRow' :: [Integer] -> [Integer]
nextRow' row = row `seqList` nextRow row

tri = iterate nextRow' [0]

The fold in seqList basically expands to lst!!0 `seq` lst!!1 `seq` lst!!2 `seq` ... `seq` result.

This is much slower for me when printing just the first 10 elements of row 100,000. I think that's because it requires computing 99,999 complete rows of the triangle.

variant of pascal's triangle in haskell - problem with lazy evaluation

1 Answers1