I am currently reading Java 8 Lambdas from O'Reilly, which is a really good book. I came across an example like this.

I have the following code:

private final BiFunction<StringBuilder, String, StringBuilder> accumulator =
    (builder, name) -> {
        if (builder.length() > 0) builder.append(",");
        builder.append("Mister:").append(name);
        return builder;
    };

final Stream<String> stringStream = Stream.of(
    "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr");
final StringBuilder reduce = stringStream
    .filter(a -> a != null)
    .reduce(new StringBuilder(), accumulator, (left, right) -> left.append(right));
System.out.println(reduce);
System.out.println(reduce.length());

This produces the expected output:

Mister:John Lennon,Mister:Paul Mccartney,Mister:George Harrison,Mister:Ringo Starr

My question is about the last parameter of the reduce method, which is a BinaryOperator.

What is this parameter used for? If I change the call to

.reduce(new StringBuilder(),accumulator,(left,right)->new StringBuilder());

the output is the same; if I pass null, a NullPointerException is thrown.

What is this parameter used for?

Update

Why do I get different results when I run it on a parallel stream?

First run:

returned StringBuilder length = 420

Second run:

returned StringBuilder length = 546

Third run:

returned StringBuilder length = 348

and so on. Why is this? Shouldn't it return all the values on each run?

halfer
chiperortiz

2 Answers

The method reduce in the interface Stream is overloaded. The parameters for the method with three arguments are:

  • identity
  • accumulator
  • combiner

The combiner supports parallel execution. Apparently, it is not used for sequential streams, although there is no such guarantee. If you turn your stream into a parallel stream, I guess you will see a difference:

Stream<String> stringStream = Stream.of(
    "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr")
    .parallel();
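One way to see when the combiner actually runs is to count its invocations. The following is only a sketch: the class name CombinerCallCount and the method concatNames are made up for illustration, and it uses plain String concatenation instead of the StringBuilder from the question so that the reduction is valid in parallel as well.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BinaryOperator;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the original example.
class CombinerCallCount {

    // Concatenates four strings with the three-argument reduce and
    // increments combinerCalls each time the combiner is invoked.
    static String concatNames(boolean parallel, AtomicInteger combinerCalls) {
        BinaryOperator<String> countingCombiner = (left, right) -> {
            combinerCalls.incrementAndGet();
            return left + right;
        };
        Stream<String> names = Stream.of("one", "two", "three", "four");
        if (parallel) {
            names = names.parallel();
        }
        return names.reduce("", String::concat, countingCombiner);
    }

    public static void main(String[] args) {
        AtomicInteger sequentialCalls = new AtomicInteger();
        AtomicInteger parallelCalls = new AtomicInteger();
        System.out.println(concatNames(false, sequentialCalls) + " / combiner calls: " + sequentialCalls);
        System.out.println(concatNames(true, parallelCalls) + " / combiner calls: " + parallelCalls);
    }
}
```

On the OpenJDK builds I am aware of, the sequential count stays at zero, but that is an implementation detail and not something the specification promises. The parallel count depends on how the stream happens to be split.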

Here is an example of how the combiner can be used to transform a sequential reduction into a reduction that supports parallel execution. Consider a stream with four Strings, with acc used as an abbreviation for accumulator.apply. The result of the reduction can then be computed as follows:

acc(acc(acc(acc(identity, "one"), "two"), "three"), "four");

With a compatible combiner, the above expression can be transformed into the following expression. Now it is possible to execute the two sub-expressions in different threads.

combiner.apply(
    acc(acc(identity, "one"), "two"),
    acc(acc(identity, "three"), "four"));
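To make the two expressions concrete, here is a small sketch that evaluates both by hand. It assumes String concatenation for the accumulator and the combiner and "" as the identity; the class name SplitReduction and the constants are invented for the illustration.

```java
import java.util.function.BinaryOperator;

// Hypothetical illustration class; ACC plays the role of accumulator.apply.
class SplitReduction {
    static final String IDENTITY = "";
    static final BinaryOperator<String> ACC = String::concat;
    static final BinaryOperator<String> COMBINER = String::concat;

    // The purely sequential form: one chain of accumulator calls.
    static String sequential() {
        return ACC.apply(ACC.apply(ACC.apply(ACC.apply(IDENTITY, "one"), "two"), "three"), "four");
    }

    // The split form: each half starts from the identity and could run
    // on its own thread; the combiner merges the partial results.
    static String split() {
        String firstHalf = ACC.apply(ACC.apply(IDENTITY, "one"), "two");
        String secondHalf = ACC.apply(ACC.apply(IDENTITY, "three"), "four");
        return COMBINER.apply(firstHalf, secondHalf);
    }

    public static void main(String[] args) {
        System.out.println(sequential());
        System.out.println(split());
    }
}
```

Both methods return "onetwothreefour". That only holds because concatenation is associative and "" really is an identity value for it.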

Regarding your second question, I use a simplified accumulator to explain the problem:

BiFunction<StringBuilder,String,StringBuilder> accumulator =
    (builder,name) -> builder.append(name);

According to the Javadoc for Stream::reduce, the accumulator has to be associative. In this case, that would imply that the following two expressions return the same result:

acc(acc(acc(identity, "one"), "two"), "three")  
acc(acc(identity, "one"), acc(acc(identity, "two"), "three"))

That's not true for the above accumulator. The problem is that you are mutating the object referenced by identity, which is a bad idea for the reduce operation. Here are two alternative implementations that should work:

// identity = ""
BiFunction<String,String,String> accumulator = String::concat;

// identity = null
BiFunction<StringBuilder,String,StringBuilder> accumulator =
    (builder,name) -> builder == null
        ? new StringBuilder(name) : builder.append(name);
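As a rough sketch of how the first alternative (identity = "") could be extended to reproduce the "Mister:" formatting from the question, the version below works only with immutable Strings. The class name ImmutableMisterReduce, the constants, and the join method are my own names, and the combiner shown is just one possible choice that is compatible with this accumulator.

```java
import java.util.function.BiFunction;
import java.util.function.BinaryOperator;
import java.util.stream.Stream;

// Hypothetical class: a String-based variant of the reduction in the question.
class ImmutableMisterReduce {

    // Returns a new String on every call; the identity "" is never modified.
    static final BiFunction<String, String, String> ACCUMULATOR =
        (acc, name) -> acc.isEmpty() ? "Mister:" + name : acc + ",Mister:" + name;

    // Joins two partial results with a comma, skipping empty partial results.
    static final BinaryOperator<String> COMBINER =
        (left, right) -> left.isEmpty() ? right
                       : right.isEmpty() ? left
                       : left + "," + right;

    static String join(Stream<String> names) {
        return names
            .filter(name -> name != null)
            .reduce("", ACCUMULATOR, COMBINER);
    }

    public static void main(String[] args) {
        System.out.println(join(Stream.of(
            "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr")));
        System.out.println(join(Stream.of(
            "John Lennon", "Paul Mccartney", "George Harrison", "Ringo Starr").parallel()));
    }
}
```

Because nothing is mutated and the combiner agrees with the accumulator, the sequential and the parallel call should print the same line.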
nosid
  • Thanks, nosid. I have a question: why am I receiving different results on every run? I guess it is because of the parallelization, but why do I get several different results using the same code? Please see my edited question. – chiperortiz Jun 01 '14 at 16:00
  • @chiperortiz: I have updated my answer regarding your second question. Does the example really originate from the book? In this case, the phrase _good book_ seems questionable. – nosid Jun 01 '14 at 16:39
  • The book has a more complex example, and it uses only a sequential stream. – chiperortiz Jun 01 '14 at 17:04
  • I will use your BiFunction, but should I use the same BinaryOperator? – chiperortiz Jun 01 '14 at 17:06

nosid's answer got it mostly right (+1) but I wanted to amplify a particular point.

The identity parameter to reduce must be an identity value. It's ok if it's an object, but if it is, it should be immutable. If the "identity" object is mutated, it's no longer an identity! For more discussion of this point, see my answer to a related question.
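A small sketch may make this visible. It passes a StringBuilder as the "identity" to a sequential reduce and then inspects that same object afterwards. The class name MutatedIdentity and the method reduceInto are invented for the illustration, and the behaviour described after the code is what I would expect from the OpenJDK implementations I know of, not something the specification guarantees.

```java
import java.util.stream.Stream;

// Hypothetical demonstration class, not taken from the book.
class MutatedIdentity {

    // Uses the given builder as the "identity" of a sequential reduction.
    static StringBuilder reduceInto(StringBuilder identity) {
        return Stream.of("one", "two")
            .reduce(identity,
                    (builder, s) -> builder.append(s),
                    (left, right) -> left.append(right));
    }

    public static void main(String[] args) {
        StringBuilder identity = new StringBuilder();
        StringBuilder result = reduceInto(identity);
        System.out.println("identity now holds: " + identity);
        System.out.println("result is the identity object: " + (result == identity));
    }
}
```

After the call, the supposedly empty identity contains the accumulated text, so it can no longer serve as a neutral starting value. In a parallel stream several threads would append to that one object at once, which fits the varying lengths reported in the question.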

It looks like this example originated from Example 5-19 of Richard Warburton, Java 8 Lambdas, O'Reilly 2014. If so, I shall have to have a word about this with the good Dr. Warburton.

Stuart Marks
  • Similarly, the BinaryOperator parameter to reduce must be *associative*. Otherwise you will get gibberish results in parallel. – Brian Goetz Jun 02 '14 at 03:45
  • The example by Richard uses a sequential stream, not a parallel stream. Thanks for your reply, Stuart. – chiperortiz Jun 02 '14 at 11:45
  • @chiperortiz Indeed the example is sequential, but it's improper for code to give correct results sequentially and incorrect results in parallel, especially in a book that's trying to explain this stuff. (Also, I suspect even the sequential code violates some restrictions, and it's just lucky that it happens to give the correct result.) – Stuart Marks Jun 02 '14 at 17:27