5

I am trying to implement a simple dp algorithm in Haskell (this is for the Collatz conjecture problem from Project Euler); here is the equivalent c++:

map<int,int> a;
int solve(int x) {
  if (a.find(x) != a.end()) return a[x];
  return a[x] = 1 + /* recursive call */;
}

So the code that I wrote in Haskell ended up looking like this:

solve :: (Memo, Int) -> (Memo, Int)
solve (mem, x) =
 case Map.lookup x mem of
  Just l -> (mem, l)
  Nothing -> let (mem', l') = {- recursive call -}
                 mem'' = Map.insert x (1+l') mem'
             in (mem'', 1+l')

(I think that I'm just reimplementing a state monad here, but never mind that for the moment.) The code which calls solve tries to find the largest value it can give for a parameter at most K=1e6:

foldl'
 (\(mem,ss) k ->
   let (mem',x') = solve (mem, k)
   in (mem', (x', k):ss))
 (Map.singleton 1 1, [(1,1)]) [2..100000]

The code as written above fails with a stack overflow. This is to be expected, I understand, because it builds up a really large unevaluated thunk. So I tried using

x' `seq` (mem', (x',k):ss)

inside foldl', and it computes the right answer for K=1e5. But this fails for K=1e6 (stack overflow in 12 seconds). Then I tried using

mem'' `seq` l' `seq` (mem'', 1+l')

in the last line of solve, but this made no difference (stack overflow still). Then I tried using

mem'' `deepseq` l' `seq` (mem'', 1+l')

This is extremely slow, presumably because deepseq walks through the entire map mem'', making the algorithm's time complexity quadratic instead of n*log(n).

What is the correct way to implement this? I'm stuck because I can't figure out how to make the entire computation strict, and I am not quite certain which part of the computation gives the stack overflow, but I suspect it's the map. I could use, e.g., arrays, but I want to understand what I am doing wrong here.

Zero Piraeus
  • 56,143
  • 27
  • 150
  • 160
Kirill
  • 470
  • 3
  • 13

1 Answers1

6

The stack overflow will not go away while you use a 32-bit signed integer type. For some starting values, the chain leaves 32-bit range and enters an infinite loop of negative numbers. Use Integer or Int64 or Word64.

Daniel Fischer
  • 181,706
  • 17
  • 308
  • 431
  • Argh, my bad. It's five times slower than c++ with Integer/Int64, but doesn't fail. Never mind. Thanks. – Kirill Apr 10 '12 at 06:27
  • Your tuple doesn't look strict enough even though you're using `foldl'`, try adding some `!` strictness tags. Five times slower sounds pretty good for Haskell actually. – Jeff Burdges Apr 10 '12 at 13:43
  • 1
    @JeffBurdges With the `seq`s it's strict enough. `maximum` may be too lazy, though, if that's used to find the longest chain. But no strictness helps against infinite loops. – Daniel Fischer Apr 10 '12 at 13:51