27

I am in the process of learning Java 8 and I came across something that I find a bit strange.

Consider the following snippet:

private MyDaoClass myDao;

public void storeRelationships(Set<Relationship<ClassA, ClassB>> relationships) {
    RelationshipTransformer transformer = new RelationshipTransformerImpl();

    myDao.createRelationships(
            relationships.stream()
            .map((input) -> transformer.transformRelationship(input))
            .collect(Collectors.toSet())
    );
}

Basically, I need to map the input set called relationships to a different type in order to conform to the API of the DAO I'm using. For the conversion, I would like to use an existing RelationshipTransformerImpl class that I instantiate as a local variable.

Now, here's my question:

If I were to modify the above code as follows:

public void storeRelationships(Set<Relationship<ClassA, ClassB>> relationships) {
    RelationshipTransformer transformer = new RelationshipTransformerImpl();

    myDao.createRelationships(
            relationships.stream()
            .map((input) -> transformer.transformRelationship(input))
            .collect(Collectors.toSet())
    );

    transformer = null;  //setting the value of an effectively final variable
}

I would obviously get a compilation error, since the local variable transformer is no longer "effectively final". However, if I replace the lambda with a method reference:

public void storeRelationships(Set<Relationship<ClassA, ClassB>> relationships) {
    RelationshipTransformer transformer = new RelationshipTransformerImpl();

    myDao.createRelationships(
            relationships.stream()
            .map(transformer::transformRelationship)
            .collect(Collectors.toSet())
    );

    transformer = null;  //setting the value of an effectively final variable
}

Then I no longer get a compilation error! Why does this happen? I thought the two ways to write the lambda expression should be equivalent, but there's clearly something more going on.

Naman
Emil D

3 Answers

24

JLS 15.13.5 may hold the explanation:

The timing of method reference expression evaluation is more complex than that of lambda expressions (§15.27.4). When a method reference expression has an expression (rather than a type) preceding the :: separator, that subexpression is evaluated immediately. The result of evaluation is stored until the method of the corresponding functional interface type is invoked; at that point, the result is used as the target reference for the invocation. This means the expression preceding the :: separator is evaluated only when the program encounters the method reference expression, and is not re-evaluated on subsequent invocations on the functional interface type.

As I understand it, since in your case transformer is the expression preceding the :: separator, it is evaluated just once and stored. Since it doesn't have to be re-evaluated in order to invoke the referenced method, it doesn't matter that transformer is later assigned null.
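
To make the quoted rule concrete, here is a minimal sketch of the same effect. It uses a static field instead of a local variable (an assumption made only so that the lambda version compiles after reassignment); the class and method names are hypothetical and not part of the question's code.

```java
import java.util.function.Supplier;

class ReceiverSnapshotDemo {
    // A field rather than a local, so the lambda may still read it after reassignment.
    static String text = "first";

    // Returns { result via method reference, result via lambda },
    // both invoked only after the field has been reassigned.
    static int[] lengthsAfterReassignment() {
        text = "first";
        Supplier<Integer> viaReference = text::length;      // `text` is evaluated here, once
        Supplier<Integer> viaLambda = () -> text.length();  // `text` is read on every call
        text = "second!";
        return new int[] { viaReference.get(), viaLambda.get() };
    }

    public static void main(String[] args) {
        int[] lengths = lengthsAfterReassignment();
        System.out.println("method reference: " + lengths[0]); // length of "first"
        System.out.println("lambda:           " + lengths[1]); // length of "second!"
    }
}
```

The method reference keeps working with the object that `text` referred to when `text::length` was evaluated, while the lambda looks the field up again each time it runs.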

Eran
  • I don't see why they couldn't do the same thing for the lambda case: capture the value of `transformer` (or any other variable) when the lambda is evaluated and store it in some internal `final` field. This is effectively what the method reference must do. You could then require that such variables only be changed outside the body of the lambda expression. I feel like I'm missing something though. – Sotirios Delimanolis Jan 02 '15 at 21:48
  • @SotiriosDelimanolis the lambda is _not_ evaluated immediately; it is only the call site which is generated at compile time. The actual linkage only happens the first time the lambda is `invokedynamic`ed. – fge Jan 03 '15 at 00:32
  • @fge I did not mean _invoked_. The lambda expression is [evaluated](http://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.27.4) to produce an instance of the functional interface. The compiler (and then JVM) can simply close over the value of any `final` variable and store it locally, ie. in the instance. – Sotirios Delimanolis Jan 03 '15 at 00:42
  • @SotiriosDelimanolis I _do_ mean invoked. From what I understand, nothing guarantees that the lambda will be invoked at once. Which means that the compiler needs to ensure that the reference is still valid when the linkage is done. Evaluating the lambda only creates the call site, it does not link it. – fge Jan 03 '15 at 01:21
  • @fge A lambda is more or less a closure. The lambda expression is parsed/evaluated/whatever you want to call it to produce an instance of the target interface. This happens right before the instance (the value of the reference to that instance) is needed. When this evaluation happens, Java needs to copy any local (and instance/class, but let's ignore that) state as instance data within the interface implementation so as to be able to use it when the implemented method is invoked. – Sotirios Delimanolis Jan 03 '15 at 02:10
  • @fge This is similar to what happens when an anonymous class instance creation expression captures (closes over) a local variable. I don't see why it couldn't copy the value of the local variable at the time of evaluation/parsing/whatever. This is what happens with an instance method reference. – Sotirios Delimanolis Jan 03 '15 at 02:12
  • @SotiriosDelimanolis but lambdas behave differently; they are indy call sites, therefore they use a bootstrap method which is called only when the linkage is actually performed. This is _very_ different from what an anonymous class does. The only thing that the compiler does is generate this bootstrap method. – fge Jan 03 '15 at 02:34
  • @fge You're using these abstract terms and I don't know what you mean by them. What is a bootstrap method, what is linkage? _Concretely_. Right before the `invokedynamic` you're talking about is called, any values (variables) that need to be available to the lambda (the closure I was talking about) are pushed onto the stack and used by the bootstrap method that sets up the instance. – Sotirios Delimanolis Jan 03 '15 at 16:59
  • @fge The corresponding instance (typically, but implementation dependent and that's what I'm arguing about) stores these values in (hidden) instance fields. Every time it needs them, it evaluates them. It doesn't have to reach out of the lambda to get them. Because of this, I'm saying I don't see a reason why any local variables need to be `final` outside the context of the lambda body. Their value in the context of the lambda is set in stone at lambda bootstrap. – Sotirios Delimanolis Jan 03 '15 at 17:03
  • @SotiriosDelimanolis "What is a bootstrap method, what is linkage? Concretely" <-- concretely, this is JSR 292. While it does not cover it all, an example is the `java.lang.invoke` API. A _bootstrap method_ is what describes a _callsite_; it will be queried by `invokedynamic` to link the callsite to actual code. When the linkage is done, the bootstrap method gets out of the way, _unless_ the callsite changes. But in Java, it never happens since Java is statically typed. This is not the case in Scala for instance, where some methods can be called with different argument types. – fge Jan 03 '15 at 17:09
  • @fge That was a rhetorical question. I understand the process. The point I'm trying to make is that the process should not prevent any local variables from changing outside the lambda expression and after it appears in the source code. Unless I'm missing something, it could be done the same way as for a method reference, for local variables. – Sotirios Delimanolis Jan 03 '15 at 17:19
  • There is no difference between captured variables and captured method reference targets at the byte code level. It is a *language design decision* that only effectively final variables can be captured. I don’t think that the comments section is an appropriate place to discuss language design decisions… – Holger Jan 05 '15 at 11:42
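
For what it's worth, the "capture the value at evaluation time" behaviour discussed in these comments can be written out by hand today: copy the variable into a second local that is never reassigned and let the lambda capture the copy. A minimal sketch, with hypothetical names (`prefix`, `snapshot`) that are not from the question:

```java
import java.util.function.Function;

class ManualSnapshotDemo {
    static String applyAfterReassignment() {
        String prefix = "old-";
        final String snapshot = prefix;                  // explicit copy taken before the lambda
        Function<String, String> f = s -> snapshot + s;  // captures the copy, which never changes
        prefix = "new-";                                 // legal: the lambda never mentions `prefix`
        return f.apply("value") + " / " + prefix + "value";
    }

    public static void main(String[] args) {
        System.out.println(applyAfterReassignment());
    }
}
```

The lambda keeps seeing the value taken at the time of the copy, which is the same outcome the method reference gives for its receiver expression.
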
5

Wild guess but to me, here is what happens...

The compiler cannot assert that the created stream is synchronous at all; it sees this as a possible scenario:

  • create stream from relationships argument;
  • reassign transformer;
  • stream unrolls.

What is generated at compile time is a call site; it is linked only when the stream unrolls.

In your first lambda, you refer to a local variable, but this variable is not part of the call site.

In the second version, since you use a method reference, the generated call site has to keep a reference to the method, and therefore to the class instance holding that method. The fact that this instance was referred to by a local variable which you change afterwards does not matter.
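
The scenario of a stream that "unrolls" later can be observed directly, because intermediate operations such as `map` are lazy: the mapping function is not called until a terminal operation runs. The following is only a sketch of that ordering; the class name and the log messages are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyStreamDemo {
    // Records the order in which things happen while building and running a pipeline.
    static List<String> events() {
        List<String> log = new ArrayList<>();
        Stream<String> pipeline = Arrays.asList("a", "b").stream()
                .map(s -> { log.add("map " + s); return s.toUpperCase(); });
        log.add("pipeline built");                                    // nothing has been mapped yet
        List<String> result = pipeline.collect(Collectors.toList()); // the mapper runs here
        log.add("collected " + result);
        return log;
    }

    public static void main(String[] args) {
        events().forEach(System.out::println);
    }
}
```

Any code placed between building the pipeline and the terminal operation therefore runs before the mapping function does, which is when a reassigned variable would become a problem for a lambda that reads it.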

My two cents...

fge
5

In your first example, transformer is referenced every time the mapping function is called, so once for every relationship.

In your second example transformer is referenced only once, when transformer::transformRelationship is passed to map(). So it doesn't matter if it changes afterward.
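
One way to see the "every time" versus "only once" difference is to put a method call in the receiver position and count how often it runs. This is a sketch under that assumption; `prefix()` and the counter are hypothetical stand-ins for `transformer`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ReceiverEvaluationCount {
    static int lookups = 0;

    // Hypothetical helper standing in for "the expression before ::".
    static String prefix() {
        lookups++;
        return "id-";
    }

    // The receiver expression prefix() is evaluated when the method reference is created.
    static int lookupsWithMethodReference(List<String> items) {
        lookups = 0;
        items.stream().map(prefix()::concat).collect(Collectors.toList());
        return lookups;
    }

    // The lambda body, including prefix(), runs again for each element.
    static int lookupsWithLambda(List<String> items) {
        lookups = 0;
        items.stream().map(s -> prefix().concat(s)).collect(Collectors.toList());
        return lookups;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println("method reference: " + lookupsWithMethodReference(items));
        System.out.println("lambda:           " + lookupsWithLambda(items));
    }
}
```

With three elements, the method reference version should call `prefix()` once and the lambda version three times.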

Those are not "the two ways to write the lambda expression" but a lambda expression and a method reference, two distinct features of the language.

Naman
a better oliver