I am trying to compute a result for each subproblem using @tailrec, the way an ordinary recursive solution produces a result for each subproblem. Here is the example I worked on.

import scala.annotation.tailrec

@tailrec
def collatz(
    n: BigInt,
    acc: BigInt,
    fn: (BigInt, BigInt) => Unit
): BigInt = {
  // Report the current number together with the steps taken so far.
  fn(n, acc)
  if (n == 1) {
    acc
  } else if (n % 2 == 0) {
    collatz(n / 2, acc + 1, fn)
  } else {
    collatz(3 * n + 1, acc + 1, fn)
  }
}

Here I am counting the number of steps a number takes to reach 1 under the Collatz rules. As an example, take the number 32.

val n = BigInt("32")
val c = collatz(n, 0, (num, acc) => {
  println("Num -> " + num + " " + " " + "Acc -> " + acc)
})

I am getting the following output.

Num -> 32  Acc -> 0
Num -> 16  Acc -> 1
Num -> 8  Acc -> 2
Num -> 4  Acc -> 3
Num -> 2  Acc -> 4
Num -> 1  Acc -> 5

A normal recursive solution returns the exact count for each number. For instance, the number 2 reaches 1 in 1 step. So each subproblem gets its exact solution, whereas in the tailrec method only the final result is computed correctly. The variable acc behaves exactly like a loop variable, as expected.

How can I change the code so that it is tail-call optimized and still gives the exact value for each subproblem? In simple words, how can I get stack-like behavior for the acc variable?

Also, a related question: how large is the overhead of the lambda function fn for large values of n, assuming the println statement is not used?

I am adding a recursive solution that produces the correct result for each subproblem.

def collatz2(
    n: BigInt,
    fn: (BigInt, BigInt) => Unit
): BigInt = {

  val c: BigInt = if (n == 1) {
    0
  } else if (n % 2 == 0) {
    collatz2(n / 2, fn) + 1
  } else {
    collatz2(3 * n + 1, fn) + 1
  }
  fn(n, c)
  c
}

It produces the following output.

Num -> 1  Acc -> 0
Num -> 2  Acc -> 1
Num -> 4  Acc -> 2
Num -> 8  Acc -> 3
Num -> 16  Acc -> 4
Num -> 32  Acc -> 5
Hariharan
  • it turns the recursive call into a loop. Sorry for tldr; a common idiom is to make a local def tailrec, where some args like `fn` from the outer def are fixed. Then the outdef just calls the tailrec f with initial values. – som-snytt Nov 13 '19 at 03:51
  • @som-snytt I understand we can make nested method to make default arguments work better. My main question is regarding how can I make tail recursive solution solve sub problem similar to a recursive solution. – Hariharan Nov 13 '19 at 03:59
  • You start by saying that you want "to find how [the] annotation `@tailrec` works", but your question is all about tail recursive code, not the annotation. The `@tailrec` annotation has no effect on the compiled code. It just issues an error if the specified routine is _not_ tail recursive. – jwvh Nov 13 '19 at 08:49
  • @jwvh In normal recursive function we can return 1 for base case and the recursive call returns result for the sub problem. Adding `@tailrec` annotation makes it only use `acc` variable. – Hariharan Nov 13 '19 at 15:23
  • @Hariharan, that is not correct. Adding `@tailrec` changes nothing in how the code is compiled or run. Its only purpose is to notify the developer, at compile time, if the annotated method is _not_ tail recursive. If the method _is_ tail recursive then the annotation does nothing. If the method is _not_ tail recursive then the annotation just halts the compilation. That is all it does. – jwvh Nov 13 '19 at 20:55
  • @jwvh I edited question to explain how recursive solution can produce answer to the sub problem that `@tailrec` solution cannot. At least I can't. – Hariharan Nov 13 '19 at 22:31
  • @Hariharan; Your question is a good one and the added clarification is helpful, but the question has nothing to do with the **annotation** because the **annotation** has no effect on your code and the results you are getting. Your question would be even clearer if you removed all `@tailrec` references. – jwvh Nov 13 '19 at 23:05
  • @jwvh I will edit the intro to add clarity but I guess `@tailrec` serves a purpose. Without the annotation the second solution very well works. The key problem with the first solution is we can't return expressions that compute on the return value of recursive call. Adding the accumulator variable removes the tail call but I can't get solution for the sub problem. Maybe you can give few suggestions on how to change it? – Hariharan Nov 13 '19 at 23:17
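
A minimal sketch of the idiom som-snytt describes in the comments above, assuming the same Collatz step rules as the question: the outer method fixes `fn`, and a local tail-recursive helper carries only the values that change. The names `CollatzLocalLoop`, `loop`, and `cur` are chosen here for illustration. This only tidies the signature; it still reports the running count, not the per-subproblem count.

```scala
import scala.annotation.tailrec

// Hypothetical wrapper object, used only so the sketch compiles on its own.
object CollatzLocalLoop {
  // Callers pass only the start value and the callback.
  def collatz(n: BigInt)(fn: (BigInt, BigInt) => Unit): BigInt = {
    // fn is captured from the enclosing scope, so the helper threads
    // only the current number and the number of steps taken so far.
    @tailrec
    def loop(cur: BigInt, acc: BigInt): BigInt = {
      fn(cur, acc)
      if (cur == 1) acc
      else if (cur % 2 == 0) loop(cur / 2, acc + 1)
      else loop(3 * cur + 1, acc + 1)
    }
    loop(n, 0)
  }
}
```

A call such as `CollatzLocalLoop.collatz(BigInt(32))((num, acc) => println(s"$num $acc"))` would be one way to use it.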

2 Answers

You can't "attain Stack type of behavior" with tail recursion unless you use an explicit stack. The @tailrec annotation only asserts that every recursive call is in tail position, which is what lets the compiler turn the method into a loop that does not grow the call stack. You have to decide whether you want tail recursion or recursive subproblem solving. Some problems (e.g. binary search) lend themselves very well to tail recursion, while others (e.g. your collatz code) require a little more thought, and still others (e.g. DFS) rely on the call stack too much to benefit as much from tail recursion.
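
To illustrate the binary search case, here is a small sketch (not taken from the question; `BinarySearchSketch`, `search`, and `go` are names made up for this example). Each recursive call is the last thing the helper does, so nothing has to be remembered on the way back and no stack is needed.

```scala
import scala.annotation.tailrec

// Hypothetical example object; assumes xs is sorted in ascending order.
object BinarySearchSketch {
  // Returns the index of target in xs, or -1 if it is absent.
  def search(xs: Vector[Int], target: Int): Int = {
    @tailrec
    def go(lo: Int, hi: Int): Int =
      if (lo > hi) -1
      else {
        val mid = lo + (hi - lo) / 2
        if (xs(mid) == target) mid
        else if (xs(mid) < target) go(mid + 1, hi) // tail call: result returned as-is
        else go(lo, mid - 1)                       // tail call: result returned as-is
      }
    go(0, xs.length - 1)
  }
}
```

Compare this with collatz2 in the question, where `+ 1` is applied after the recursive call returns; that pending work is what needs a stack.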

Brian McCutchon

I'm not sure I understood your question correctly. It sounds like you are asking us to write collatz2 so that it is tail recursive. I have rewritten it in two ways.

Although I have provided two solutions, they are really the same thing. One uses a List as a stack, where the head of the List is the top of the stack. The other uses the mutable.Stack data structure. Study the two solutions until you can see why they are both the same as collatz2 in the original question.

To make the program tail recursive, what we have to do is to simulate the effect of pushing values onto a stack, and then popping them off one by one. It is during the pop phase that we give the value for Acc. (For those who don't remember, Acc in Hariharan's parlance is the index of each term.)

import scala.collection.mutable

object CollatzCount {

  def main(args: Array[String]) = {
    val start = 32

    collatzFinalList(start, printer)

    collatzFinalStack(start, printer)

  }

  def collatzInnerList(n: Int, acc: List[Int]): List[Int] = {
    if (n == 1) n :: acc
    else if (n % 2 == 0) collatzInnerList(n/2, n :: acc )
    else collatzInnerList(3*n + 1, n :: acc )
  }

  def collatzFinalList(n: Int, fun: (Int, Int)=>Unit): Unit = {
    val acc = collatzInnerList(n, List())
    acc.foldLeft(0){ (ctr, e) =>
      fun(e, ctr)
      ctr + 1
    }
  }

  def collatzInnerStack(n: Int, stack: mutable.Stack[Int]): mutable.Stack[Int] = {
    if (n == 1) {
      stack.push(n)
      stack
    } else if (n % 2 == 0) {
      stack.push(n)
      collatzInnerStack(n/2, stack)
    } else {
      stack.push(n)
      collatzInnerStack(3*n + 1, stack)
    }
  }

  def popStack(ctr: Int, stack: mutable.Stack[Int], fun: (Int, Int)=>Unit): Unit = {
    if (stack.nonEmpty) {
      val popped = stack.pop()
      fun(popped, ctr)
      popStack(ctr + 1, stack, fun)
    } else ()
  }


  def collatzFinalStack(n: Int, fun: (Int, Int) => Unit): Unit = {
    val stack = collatzInnerStack(n, mutable.Stack())
    popStack(0, stack, fun)
  }


  val printer = (x: Int, y: Int) => println("Num -> " + x + " " + " " + "Acc -> " + y)

}
Allen Han
  • The stack solution simulates the call stack very well, and it is a vast improvement over a deep call stack, but for very large values of n we still need to store all of them in the stack. Can the current stack solution be refactored so that it does not store all the values? For reference, the tail recursive solution does not store all the values; only the current value is passed to the lambda. I assume that would let it run indefinitely. – Hariharan Nov 14 '19 at 02:30
  • @Hariharan There is no way to avoid storing all the values, since you can't compute the correct value of Acc otherwise. The program is fundamentally limited by the heap space of the JVM, and there is no way to get around it without persisting the state of your stack. An example of a persistence solution is storing the state of the stack in a database. Please realize there is no way to refactor the program so that it can run forever. In the end, you are fundamentally limited by the amount of disk space your computer has anyway, even with a persistence solution. – Allen Han Nov 14 '19 at 02:35
  • @Hariharan Also, to repeat what I said in the answer, both answers are actually stack solutions. They do exactly the same thing as collatz2. – Allen Han Nov 14 '19 at 02:38
  • I understand we can't run infinitely. Even `n:BigInt` has to be stored in memory so at one point it will fail. – Hariharan Nov 14 '19 at 02:50
  • I tried your solution with `Map` instead of `Stack`; then we can clear the unused `Map` entries after a few iterations in the callback. It is better than the original solution. I also see that this approach is used in dynamic programming, where previously computed values are memoized. – Hariharan Nov 14 '19 at 03:13
  • @Hariharan I think I see what you mean. You want to discard some values so that you don't run out of memory. It is possible to do this with the List based solution. Just call `take` or `drop` on the List, depending on which part of the stack you want to keep. Since you are working directly with the List representation of the stack, this is possible. Then record the number of elements discarded. As long as you keep a running count of the discarded values, this should be doable. Thank you for correcting me. – Allen Han Nov 14 '19 at 03:46
  • @Hariharan I am not going to add to my solution to include the change you want since I am not sure which part of the stack you want to keep - either the top or the bottom. Probably, you want to keep the top of the stack. – Allen Han Nov 14 '19 at 03:48
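
One way the `take`-based idea from the comments above might look, as a sketch only. It assumes the top of the stack (the most recent terms, closest to 1) is what should be kept. `CollatzBounded`, `build`, `run`, and the `keep` parameter are hypothetical names, not part of the original answer.

```scala
import scala.annotation.tailrec

// Hypothetical sketch: keeps at most `keep` terms and counts the rest.
object CollatzBounded {
  // Returns the `keep` most recent terms (head = 1) and how many older
  // terms were dropped from the bottom of the stack.
  @tailrec
  def build(
      n: BigInt,
      keep: Int,
      acc: List[BigInt] = Nil,
      discarded: Long = 0L
  ): (List[BigInt], Long) = {
    val pushed = n :: acc
    // At most one element over the limit, so take(keep) drops the oldest one.
    val (kept, dropped) =
      if (pushed.length > keep) (pushed.take(keep), discarded + 1)
      else (pushed, discarded)
    if (n == 1) (kept, dropped)
    else if (n % 2 == 0) build(n / 2, keep, kept, dropped)
    else build(3 * n + 1, keep, kept, dropped)
  }

  // Calls fn(term, stepsToReachOne) for the kept terms only and
  // returns the total step count for n.
  def run(n: BigInt, keep: Int)(fn: (BigInt, BigInt) => Unit): BigInt = {
    require(keep >= 1, "keep must be at least 1")
    val (kept, discarded) = build(n, keep)
    kept.zipWithIndex.foreach { case (term, steps) => fn(term, BigInt(steps)) }
    BigInt(kept.length - 1) + BigInt(discarded)
  }
}
```

With `n = 32` and `keep = 3`, the callback receives (1, 0), (2, 1), (4, 2) and `run` returns 5. The terms 8, 16, and 32 are counted but never reported, which is the trade-off for bounded memory.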