3

I'm experimenting with recursion:

def fac
//fac = { int curr, res = 1G -> 1 >= curr ? res : fac( curr - 1, res * curr ) }
fac = { int curr, res = 1G -> 1 >= curr ? res : fac.trampoline( curr - 1, res * curr ) }
fac = fac.trampoline()

def rnd = new Random()

long s = System.currentTimeMillis()

100000.times{ fac rnd.nextInt( 40 ) }

println "done in ${System.currentTimeMillis() - s} ms / ${fac(40)}"

If I use it like this, I'm getting this:

done in 691 ms

If I uncomment line #2 and comment lines #3-4 to remove trampoline() and run it, I'm getting significantly lower numbers:

done in 335 ms

So, with trampoline the recursion works 2 times slower.

What am I missing?

P.S.

If I run the same example in Scala 2.12:

def fac( curr:Int, acc:BigInt = 1 ):BigInt = if( 1 >= curr ) acc else fac( curr - 1, curr * acc )
val s = System.currentTimeMillis
for( ix <- 0 until 100000 ) fac( scala.util.Random.nextInt(40).toInt )

println( s"done in ${System.currentTimeMillis - s} ms" )

it executes a bit faster:

done in 178 ms

UPDATE

Rewriting the closure to a method with the annotation:

@groovy.transform.TailRecursive
def fac( int curr, res = 1G ) { 1 >= curr ? res : fac( curr - 1, res * curr ) }
// the rest

gives

done in 164 ms

and is super-coll. Nevertheless, I still want to know about trampoline() :)

Szymon Stepniak
  • 40,216
  • 10
  • 104
  • 131
injecteer
  • 20,038
  • 4
  • 45
  • 89
  • 3
    Testing your code gives me roughly the same numbers. A big jump in performance is gained by writing `fac` out as a proper function. Then with `@groovy.transform.TailRecursive` it gets again a bit faster (quite close actually to the scala versions). – cfrick Jun 13 '19 at 11:17
  • 1
    @cfrick see the update... – injecteer Jun 13 '19 at 11:24
  • I guess because trampoline creates more closures to avoid the stackoverflow...but that's just a guess – tim_yates Jun 13 '19 at 13:28
  • adding `@Memoized` eats up another 10 ms of exec time – injecteer Jun 13 '19 at 14:15

1 Answers1

6

As stated in the documentation, Closure.trampoline() prevents from overflowing the call stack.

Recursive algorithms are often restricted by a physical limit: the maximum stack height. For example, if you call a method that recursively calls itself too deep, you will eventually receive a StackOverflowException.

An approach that helps in those situations is by using Closure and its trampoline capability.

Closures are wrapped in a TrampolineClosure. Upon calling, a trampolined Closure will call the original Closure waiting for its result. If the outcome of the call is another instance of a TrampolineClosure, created perhaps as a result to a call to the trampoline() method, the Closure will again be invoked. This repetitive invocation of returned trampolined Closures instances will continue until a value other than a trampolined Closure is returned. That value will become the final result of the trampoline. That way, calls are made serially, rather than filling the stack.


Source: http://groovy-lang.org/closures.html#_trampoline

However, using trampoline comes with a cost. Let's take a look at the JVisualVM samples.

Non-trampoline use case

Running an example without trampoline() we get a result in ~441 ms

done in 441 ms / 815915283247897734345611269596115894272000000000

This execution allocates ~2,927,550 objects and consumes around 100 MB of memory.

enter image description here

The CPU has a little to do, and except spending time on main() and run() methods, it spends some cycles on coercing arguments.

enter image description here

The trampoline() use case

Introducing the trampoline does change a lot. Firstly, it makes execution time almost two times slower compared to the previous attempt.

done in 856 ms / 815915283247897734345611269596115894272000000000

Secondly, it allocates ~5,931,470 (!!!) objects and consumes ~221 MB of memory. The main difference is that in the previous case a single of $_main_closure1 was used across all executions, and in case of using trampoline - every call to trampoline() method creates:

  • a new $_main_closure1 object
  • which gets wrapped with the CurriedClosure<T>
  • which then gets wrapped with the TrampolineClosure<T>

Only this allocates more than 1,200,000 objects.

enter image description here

If it comes to the CPU, it also has much more things to do. Just look at the numbers:

  • all calls to TrampolineClosure<T>.<init>() consume 199 ms
  • using trampoline introduces calls to PojoeMetaMethodSite$PojoCachedMethodSietNoUnwrap.invoke() which in total consume additional 201 ms
  • all calls to CachedClass$3.initValue() consume in total additional 98.8 ms
  • all calls to ClosureMetaClass$NormalMethodChooser.chooseMethod() consume in total additional 100 ms

enter image description here

And this is exactly why introducing trampoline in your case makes the code execution much slower.

So why @TailRecursive does much better?

In short - @TailRecursive annotation replaces all closures and recursive calls with good old while-loop. The factorial function with @TailRecursive looks something like this at the bytecode level:

//
// Source code recreated from a .class file by IntelliJ IDEA
// (powered by Fernflower decompiler)
//

package factorial;

import groovy.lang.GroovyObject;
import groovy.lang.MetaClass;
import java.math.BigInteger;
import org.codehaus.groovy.runtime.ScriptBytecodeAdapter;
import org.codehaus.groovy.runtime.dgmimpl.NumberNumberMultiply;
import org.codehaus.groovy.transform.tailrec.GotoRecurHereException;

public class Groovy implements GroovyObject {
    public Groovy() {
        MetaClass var1 = this.$getStaticMetaClass();
        this.metaClass = var1;
    }

    public static BigInteger factorial(int number, BigInteger acc) {
        BigInteger _acc_ = acc;
        int _number_ = number;

        try {
            while(true) {
                try {
                    while(_number_ != 1) {
                        int __number__ = _number_;
                        int var7 = _number_ - 1;
                        _number_ = var7;
                        Number var8 = NumberNumberMultiply.multiply(__number__, _acc_);
                        _acc_ = (BigInteger)ScriptBytecodeAdapter.castToType(var8, BigInteger.class);
                    }

                    BigInteger var4 = _acc_;
                    return var4;
                } catch (GotoRecurHereException var13) {
                    ;
                }
            }
        } finally {
            ;
        }
    }

    public static BigInteger factorial(int number) {
        return factorial(number, (BigInteger)ScriptBytecodeAdapter.castToType(1, BigInteger.class));
    }
}

I have documented this use case on my blog some time ago. You can read the blog post if you want to get more information:

https://e.printstacktrace.blog/tail-recursive-methods-in-groovy/

Community
  • 1
  • 1
Szymon Stepniak
  • 40,216
  • 10
  • 104
  • 131