
What is the difference between the following three definitions under the hood, in terms of stack/heap allocation, garbage collection, resource usage and performance?

def Do1(a:String) = { (b:String) => { println(a,b) }}
def Do2(a:String)(b:String) = { println(a,b) }
def Do3(a:String, b:String) = { println(a,b) }

Do1("a")("b")
Do2("a")("b")
(Do3("a", _:String))("b")

That is, apart from the obvious surface differences in the declarations: how many arguments each takes and what each returns.

Alex
  • Well, this has nothing to do with currying. In any case, the question was about what happens under the hood. – Alex Aug 25 '15 at 10:52
  • Sorry, I overlooked an important part of your question: how it relates to memory allocation. I still think there's a lot of relevant info in the related question, but it's not a duplicate, you're right. – Silly Freak Aug 25 '15 at 10:58
  • I would guess that on a bytecode level `Do2` and `Do3` are the same. Your call to `Do2` is likely a simple method call, whereas I'd expect the `Do3` call to create an intermediate function object. `Do1` should do that anyway. Maybe someone with more time right now could do a `javap` and write it up as an answer. – Silly Freak Aug 25 '15 at 11:02

1 Answer


Decompiling the following class (note the additional call to Do2 compared to your question):

class Test {
  def Do1(a: String) = { (b: String) => { println(a, b) } }
  def Do2(a: String)(b: String) = { println(a, b) }
  def Do3(a: String, b: String) = { println(a, b) }

  Do1("a")("b")
  Do2("a")("b")
  (Do2("a") _)("b")
  (Do3("a", _: String))("b")
}

yields this pure Java code:

public class Test {
    public Function1<String, BoxedUnit> Do1(final String a) {
        new AbstractFunction1() {
            public final void apply(String b) {
                Predef..MODULE$.println(new Tuple2(a, b));
            }
        };
    }

    public void Do2(String a, String b) {
        Predef..MODULE$.println(new Tuple2(a, b));
    }

    public void Do3(String a, String b) {
        Predef..MODULE$.println(new Tuple2(a, b));
    }

    public Test() {
        Do1("a").apply("b");
        Do2("a", "b");
        new AbstractFunction1() {
            public final void apply(String b) {
                Test.this.Do2("a", b);
            }
        }.apply("b");
        new AbstractFunction1() {
            public final void apply(String x$1) {
                Test.this.Do3("a", x$1);
            }
        }.apply("b");
    }
}

(This code doesn't compile as written, for example Do1 lacks a return statement, but it suffices for analysis.)


Let's look at it part by part (Scala & Java in each listing):

def Do1(a: String) = { (b: String) => { println(a, b) } }

public Function1<String, BoxedUnit> Do1(final String a) {
    new AbstractFunction1() {
        public final void apply(String b) {
            Predef.MODULE$.println(new Tuple2(a, b));
        }
    };
}

No matter how Do1 is called, a new Function object is created.


def Do2(a: String)(b: String) = { println(a, b) }

public void Do2(String a, String b) {
    Predef.MODULE$.println(new Tuple2(a, b));
}

def Do3(a: String, b: String) = { println(a, b) }

public void Do3(String a, String b) {
    Predef.MODULE$.println(new Tuple2(a, b));
}

Do2 and Do3 compile down to the same bytecode. The difference is exclusively in the @ScalaSignature annotation.
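
One way to check this without a decompiler is to ask Java reflection for the JVM-level parameter types of both methods. The following is a minimal sketch, not part of the decompiled output: the names SigDemo, SigCheck and jvmParams are made up for illustration, and Do2/Do3 return a String instead of printing so the result is easy to inspect.

```scala
// Hypothetical demo class: Do2 and Do3 mirror the question's shapes,
// but return a String instead of printing.
class SigDemo {
  def Do2(a: String)(b: String): String = a + b
  def Do3(a: String, b: String): String = a + b
}

object SigCheck {
  // Render the JVM-level parameter types of every public method with this name.
  def jvmParams(name: String): String =
    classOf[SigDemo].getMethods
      .filter(m => m.getName == name)
      .map(m => m.getParameterTypes.map(t => t.getSimpleName).mkString(","))
      .mkString(";")

  def main(args: Array[String]): Unit = {
    println(jvmParams("Do2")) // String,String
    println(jvmParams("Do3")) // String,String
  }
}
```

Both methods show up as a single two-parameter method, so the separate parameter lists of Do2 exist only at the Scala level.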


Do1("a")("b")

Do1("a").apply("b");

Do1 is straightforward: the returned function is immediately applied.

Do2("a")("b")

Do2("a", "b");

With Do2, the compiler sees that this is not a partial application, and compiles it to a single method invocation.


(Do2("a") _)("b")

new AbstractFunction1() {
    public final void apply(String b) {
        Test.this.Do2("a", b);
    }
}.apply("b");

(Do3("a", _: String))("b")

new AbstractFunction1() {
    public final void apply(String x$1) {
        Test.this.Do3("a", x$1);
    }
}.apply("b");

Here, Do2 and Do3 are first partially applied, then the returned functions are immediately applied.
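
To make the two call shapes easier to compare, here is a small sketch under some assumptions: PartialDemo, f2 and f3 are illustrative names, and Do2/Do3 return the pair instead of printing it. The exact bytecode the compiler emits for the function values may differ between compiler versions, but in each case a function value is built at the call site.

```scala
object PartialDemo {
  // Hypothetical variants of Do2/Do3 that return the pair instead of printing it.
  def Do2(a: String)(b: String): (String, String) = (a, b)
  def Do3(a: String, b: String): (String, String) = (a, b)

  // Partial application: only "a" is supplied, so the compiler has to build
  // a function value that remembers it and waits for b.
  val f2: String => (String, String) = Do2("a")
  val f3: String => (String, String) = Do3("a", _: String)

  def main(args: Array[String]): Unit = {
    println(f2("b"))        // (a,b)
    println(f3("b"))        // (a,b)
    println(Do2("a")("b"))  // (a,b), full application: a plain method call
  }
}
```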


Conclusion:

I would say that Do2 and Do3 are mostly equivalent in the generated bytecode. A full application results in a simple, cheap method call. Partial application generates anonymous Function classes at the caller. Which variant you use depends mostly on what intent you're trying to communicate.

Do1 always creates an intermediate function object, but does so in the called code. If you expect to partially apply the function a lot, then using this variant will reduce your code size and may trigger the JIT compiler earlier, because the same code is called more often. Full application will be slower, at least until the JIT compiler inlines the call and subsequently eliminates the object creation at individual call sites. I'm not an expert on this, so I don't know whether you can expect that kind of optimization. My best guess would be that you can, for pure functions.
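
If the allocation in Do1 is a concern, one option is to keep the returned function around and reuse it rather than calling Do1 again for every full application. This is only a sketch of that idea, with assumed names (ReuseDemo, withA, render) and a Do1 variant that returns a String instead of printing; whether it matters in practice would have to be measured.

```scala
object ReuseDemo {
  // Hypothetical variant of Do1 that returns a String instead of printing.
  def Do1(a: String): String => String = (b: String) => s"($a,$b)"

  // Do1 is called once here, so the function value is created once...
  val withA: String => String = Do1("a")

  // ...and then applied to every element without calling Do1 again.
  def render(bs: Seq[String]): Seq[String] = bs.map(withA)

  def main(args: Array[String]): Unit =
    render(Seq("b", "c", "d")).foreach(println)
}
```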

Silly Freak
  • Thanks a lot for the detailed analysis and especially the conclusion. Appreciate the time and effort. – Alex Aug 26 '15 at 10:17
  • @Alex You're welcome! Sorry again for missing the point of your question at first. Answering it was very interesting! – Silly Freak Aug 26 '15 at 10:48