33

I'm writing a library for novice programmers so I'm trying to keep the API as clean as possible.

One of the things my library needs to do is perform some complex computations on a large collection of ints or longs. There are lots of scenarios and business objects that my users need to compute these values from, so I thought the best way would be to use streams: users map their business objects to an IntStream or LongStream, and the computation then runs inside a collector.

However, IntStream and LongStream only have the three-parameter collect method:

collect(Supplier<R> supplier, ObjIntConsumer<R> accumulator, BiConsumer<R,R> combiner)

They don't have the simpler collect(Collector) method that Stream<T> has.

So instead of being able to do

Collection<T> businessObjs = ...
MyResult result = businessObjs.stream()
                              .mapToInt( ... )
                              .collect( new MyComplexComputation(...));

I have to provide suppliers, accumulators and combiners like this:

MyResult result = businessObjs.stream()
                              .mapToInt( ... )
                              .collect( 
                                  ()-> new MyComplexComputationBuilder(...),
                                  (builder, v)-> builder.add(v),
                                  (a,b)-> a.merge(b))
                              .build(); //prev collect returns Builder object

This is way too complicated for my novice users and is very error prone.

My workaround is to write static methods that take an IntStream or LongStream as input and hide the collector creation and execution from the caller:

public static MyResult compute(IntStream stream, ...){
       return stream.collect( 
                        ()-> new MyComplexComputationBuilder(...),
                        (builder, v)-> builder.add(v),
                        (a,b)-> a.merge(b))
               .build();
}

But that doesn't follow the normal conventions of working with Streams:

IntStream tmpStream = businessObjs.stream()
                              .mapToInt( ... );

 MyResult result = MyUtil.compute(tmpStream, ...);

You have to either save the stream in a temporary variable and pass that to the static method, or create the stream inside the static call, which may be confusing when it is mixed in with the other parameters to my computation.

Is there a cleaner way to do this while still working with IntStream or LongStream ?

Vitaliy
dkatzel
  • Unfortunately, my advice would be using `Stream`. You can get it from `IntStream` by `mapToObj(Function.identity())`. – Dmitry Ginzburg May 18 '15 at 18:57
  • @DmitryGinzburg the Stream will potentially have many thousands of elements and the computation is pretty complex, I don't want to be penalized by all the boxing/unboxing – dkatzel May 18 '15 at 19:04
  • The compiler may be able to eliminate the boxing/unboxing if it can inline the code path from the conversion to your consumers. Just write them with *int*-based interfaces as you would with `IntStream` and see if it generates any garbage or not. – the8472 May 18 '15 at 19:11
  • @DmitryGinzburg, `IntStream#boxed()` provides the same functionality. – the8472 May 18 '15 at 19:15
  • @the8472 Oops, thank you for the note. – Dmitry Ginzburg May 18 '15 at 19:31
  • Well, we shouldn't have too much faith in compiler optimization either. – ZhongYu May 18 '15 at 19:39
  • No faith required, as it is testable. – the8472 May 18 '15 at 19:51
  • Good for you. To me, it is mostly faith based - I *believe* I can write it this way because the compiler can optimize it; I *believe* I shouldn't do it that way because the compiler is unlikely to optimize it. - I'm too lazy to test every permutation of implementation strategies. – ZhongYu May 18 '15 at 19:56
  • @bayou.io I posted some microbenchmarks which should help shed light on this issue – dkatzel May 19 '15 at 18:04

5 Answers

27

We did in fact prototype some Collector.OfXxx specializations. What we found -- in addition to the obvious annoyance of more specialized types -- was that this was not really very useful without having a full complement of primitive-specialized collections (like Trove does, or GS-Collections, but which the JDK does not have). Without an IntArrayList, for example, a Collector.OfInt merely pushes the boxing somewhere else -- from the Collector to the container -- which is no big win, and lots more API surface.

Brian Goetz
  • So you don't think there would be a performance difference? I was unaware there'd be any boxing if I used an `IntStream` – dkatzel May 18 '15 at 20:34
  • You're looking at the wrong end of the stream. An IntStream does no boxing when manipulating ints. But what result containers can you put ints into without boxing? Not an ArrayList, or HashSet, or ... Collector.OfInt is not useful without a rich set of int-friendly things to collect into. – Brian Goetz May 18 '15 at 20:41
  • I think my use cases will primarily use mapping functions that return ints, like `stream.mapToInt(String::length)`, so there shouldn't be any boxing on that end either – dkatzel May 18 '15 at 20:48
  • @dkatzel Right, the whole point of IntStream and friends here is to be able to not box on every operation, and you do get that. If we had primitive-friendly collections, then Collector.OfXxx would make more sense, but we don't have those, so we stopped where the specialization yields practical value. – Brian Goetz May 18 '15 at 20:54
  • @Brian - Even though JDK has no primitive collections, others do, and they are used extensively by user code. Therefore I'm not quite convinced. – ZhongYu May 18 '15 at 21:20
  • @bayou.io You don't have to be convinced. The question was "why didn't they", and the answer was "we prototyped it, and concluded that the return-on-{effort,cost,complexity} wasn't there." – Brian Goetz May 18 '15 at 21:54
  • @BrianGoetz - of course, you are the man (of lambda/Stream). I'm just questioning a very specific point listed in your justification, but forget it. – ZhongYu May 18 '15 at 22:09
6

Perhaps if method references are used instead of lambdas, the code needed for the primitive stream collect will not seem as complicated.

MyResult result = businessObjs.stream()
                              .mapToInt( ... )
                              .collect( 
                                  MyComplexComputationBuilder::new,
                                  MyComplexComputationBuilder::add,
                                  MyComplexComputationBuilder::merge)
                              .build(); //prev collect returns Builder object
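For the method references above to resolve, the builder's methods need shapes that line up with `Supplier<R>`, `ObjIntConsumer<R>` and `BiConsumer<R,R>`. The following is only a minimal sketch of such a builder: the fields, the `MyResult` contents (a count and a mean) and the `meanOfLengths` helper are assumptions made up for illustration, not the asker's actual computation.

```java
import java.util.Arrays;
import java.util.List;

public class BuilderSketch {

    /** Hypothetical result type: here just a count and the mean of the collected ints. */
    static final class MyResult {
        final long count;
        final double mean;

        MyResult(long count, double mean) {
            this.count = count;
            this.mean = mean;
        }
    }

    /** Hypothetical builder whose method shapes match IntStream.collect. */
    static final class MyComplexComputationBuilder {
        private long count;
        private long sum;

        // Usable as ObjIntConsumer<MyComplexComputationBuilder>: (builder, int) -> void
        void add(int value) {
            count++;
            sum += value;
        }

        // Usable as BiConsumer<Builder, Builder>: folds 'other' into this builder
        void merge(MyComplexComputationBuilder other) {
            count += other.count;
            sum += other.sum;
        }

        MyResult build() {
            return new MyResult(count, count == 0 ? 0.0 : (double) sum / count);
        }
    }

    /** Illustrative helper: mean length of a list of strings, with no boxing. */
    static MyResult meanOfLengths(List<String> words) {
        return words.stream()
                .mapToInt(String::length)
                .collect(MyComplexComputationBuilder::new,
                         MyComplexComputationBuilder::add,
                         MyComplexComputationBuilder::merge)
                .build();
    }

    public static void main(String[] args) {
        MyResult r = meanOfLengths(Arrays.asList("a", "bb", "ccc"));
        System.out.println(r.count + " values, mean " + r.mean);
    }
}
```

Because `merge` mutates the left-hand builder instead of returning a new one, it fits the `BiConsumer` combiner that the primitive streams expect.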

In Brian's definitive answer to this question, he mentions two other Java collection frameworks that do have primitive collections that actually can be used with the collect method on primitive streams. I thought it might be useful to illustrate some examples of how to use the primitive containers in these frameworks with primitive streams. The code below will also work with a parallel stream.

// Eclipse Collections
List<Integer> integers = Interval.oneTo(5).toList();

Assert.assertEquals(
        IntInterval.oneTo(5),
        integers.stream()
                .mapToInt(Integer::intValue)
                .collect(IntArrayList::new, IntArrayList::add, IntArrayList::addAll));

// Trove Collections

Assert.assertEquals(
        new TIntArrayList(IntStream.range(1, 6).toArray()),
        integers.stream()
                .mapToInt(Integer::intValue)
                .collect(TIntArrayList::new, TIntArrayList::add, TIntArrayList::addAll));

Note: I am a committer for Eclipse Collections.
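For readers who have neither library on the classpath, the same three-argument pattern can be tried with a JDK class that already accumulates ints without boxing, `java.util.IntSummaryStatistics`. This is just a sketch to show the shape; the class name `SummaryCollectSketch` and the `summarize` helper are invented for the example.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

public class SummaryCollectSketch {

    /** Collects an IntStream into IntSummaryStatistics via the 3-argument collect. */
    static IntSummaryStatistics summarize(IntStream values) {
        return values.collect(
                IntSummaryStatistics::new,      // Supplier<R>
                IntSummaryStatistics::accept,   // ObjIntConsumer<R>
                IntSummaryStatistics::combine); // BiConsumer<R, R>
    }

    public static void main(String[] args) {
        // Also safe on a parallel stream, because combine merges partial results.
        IntSummaryStatistics stats = summarize(IntStream.rangeClosed(1, 5).parallel());
        System.out.println(stats);
    }
}
```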

Donald Raab
3

I've implemented primitive collectors in my library StreamEx (since version 0.3.0). There are interfaces IntCollector, LongCollector and DoubleCollector which extend the Collector interface and are specialized to work with primitives. There's an additional minor difference in the combining procedure, as methods like IntStream.collect accept a BiConsumer instead of a BinaryOperator.
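To make the BiConsumer versus BinaryOperator point concrete, here is a rough sketch of how such an interface could be shaped. It is not the StreamEx source: `PrimitiveIntCollector`, `summingToLong` and the static `collect` helper are hypothetical names written for this illustration.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.ObjIntConsumer;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.IntStream;

public class IntCollectorSketch {

    /** Hypothetical primitive-aware collector; still usable as a plain Collector. */
    interface PrimitiveIntCollector<A, R> extends Collector<Integer, A, R> {
        ObjIntConsumer<A> intAccumulator();   // takes int, no boxing
        BiConsumer<A, A> merger();            // the shape IntStream.collect wants

        @Override
        default BiConsumer<A, Integer> accumulator() {
            ObjIntConsumer<A> acc = intAccumulator();
            return (a, boxed) -> acc.accept(a, boxed); // unboxes for Stream<Integer>
        }

        @Override
        default BinaryOperator<A> combiner() {
            BiConsumer<A, A> m = merger();
            return (left, right) -> { m.accept(left, right); return left; };
        }

        @Override
        default Set<Collector.Characteristics> characteristics() {
            return Collections.emptySet();
        }
    }

    /** Example collector: sums ints into a long held in a one-element array. */
    static PrimitiveIntCollector<long[], Long> summingToLong() {
        return new PrimitiveIntCollector<long[], Long>() {
            @Override public Supplier<long[]> supplier() { return () -> new long[1]; }
            @Override public ObjIntConsumer<long[]> intAccumulator() { return (box, v) -> box[0] += v; }
            @Override public BiConsumer<long[], long[]> merger() { return (a, b) -> a[0] += b[0]; }
            @Override public Function<long[], Long> finisher() { return box -> box[0]; }
        };
    }

    /** Bridges the collector onto the JDK's 3-argument IntStream.collect. */
    static <A, R> R collect(IntStream stream, PrimitiveIntCollector<A, R> c) {
        return c.finisher().apply(
                stream.collect(c.supplier(), c.intAccumulator(), c.merger()));
    }

    public static void main(String[] args) {
        long viaPrimitive = collect(IntStream.rangeClosed(1, 100), summingToLong());
        long viaBoxed = IntStream.rangeClosed(1, 100).boxed().collect(summingToLong());
        System.out.println(viaPrimitive + " " + viaBoxed);
    }
}
```

The default `combiner()` adapts the mutating `merger()` by returning its left argument, which is the kind of small difference in the combining step described above.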

There is a bunch of predefined collectors to join numbers into a string, store them in a primitive array or a BitSet, find the min, max or sum, calculate summary statistics, and perform group-by and partition-by operations. Of course, you can define your own collectors. Here are several usage examples (assuming that you have an int[] array named input holding the data).

Join numbers as string with separator:

String nums = IntStreamEx.of(input).collect(IntCollector.joining(","));

Grouping by last digit:

Map<Integer, int[]> groups = IntStreamEx.of(input)
      .collect(IntCollector.groupingBy(i -> i % 10));

Sum positive and negative numbers separately:

Map<Boolean, Integer> sums = IntStreamEx.of(input)
      .collect(IntCollector.partitioningBy(i -> i > 0, IntCollector.summing()));

Here's a simple benchmark which compares these collectors and usual object collectors.

Note that my library does not provide (and will not provide in the future) any user-visible data structures like maps on primitives, so grouping is performed into a regular HashMap. However, if you are using Trove/GS/HFTC/whatever, it's not so difficult to write additional primitive collectors for the data structures defined in these libraries to gain more performance.

Tagir Valeev
2

Convert the primitive streams to boxed object streams if there are methods you're missing.

MyResult result = businessObjs.stream()
                          .mapToInt( ... )
                          .boxed()
                          .collect( new MyComplexComputation(...));

Or don't use the primitive streams in the first place and work with Integers the whole time.

MyResult result = businessObjs.stream()
                          .map( ... )     // map to Integer not int
                          .collect( new MyComplexComputation(...));
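As a sketch of what a `Collector` for the boxed stream might look like, `Collector.of` can wrap a small mutable builder. Everything named here (`MeanBuilder`, `meanCollector`, `meanOfLengths`) is a stand-in invented for the example; the real `MyComplexComputation` is not shown in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

public class BoxedCollectorSketch {

    /** Hypothetical mutable accumulator standing in for the real computation. */
    static final class MeanBuilder {
        private long count;
        private long sum;

        void add(int value) {          // Integer elements are unboxed on the way in
            count++;
            sum += value;
        }

        MeanBuilder merge(MeanBuilder other) { // BinaryOperator shape: returns the result
            count += other.count;
            sum += other.sum;
            return this;
        }

        double build() {
            return count == 0 ? 0.0 : (double) sum / count;
        }
    }

    /** A Collector over boxed Integers, built from the four functions. */
    static Collector<Integer, MeanBuilder, Double> meanCollector() {
        return Collector.of(MeanBuilder::new, MeanBuilder::add,
                            MeanBuilder::merge, MeanBuilder::build);
    }

    static double meanOfLengths(List<String> words) {
        return words.stream()
                .map(String::length)   // Stream<Integer>, so each length is boxed
                .collect(meanCollector());
    }

    public static void main(String[] args) {
        System.out.println(meanOfLengths(Arrays.asList("a", "bb", "ccc")));
    }
}
```

The call site stays a single `collect(...)`, at the cost of one `Integer` allocation per element that falls outside the `Integer` cache.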
John Kugelman
  • There is a specific reason to have primitive streams. Boxing the values again, or using boxed primitives from the start, defeats that purpose. Otherwise there is another alternative with for-loops as well. – SijuMathew Dec 30 '21 at 09:17
1

Mr. Goetz provided the definitive answer for why the decision was made not to include specialized Collectors. However, I wanted to investigate further how much this decision affects performance.

I thought I would post my results as an answer.

I used the JMH microbenchmark framework to time how long the calculation takes with both kinds of collectors over collections of sizes 1, 100, 1000, 100,000 and 1 million:

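import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import java.util.stream.LongStream;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

// BusinessObj, BusinessObjFactory and MyUtil are the author's own classes (not shown).
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Thread)
public class MyBenchmark {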

@Param({"1", "100", "1000", "100000", "1000000"})
public int size;

List<BusinessObj> seqs;

@Setup
public void setup(){
    seqs = new ArrayList<BusinessObj>(size);
    Random rand = new Random();
    for(int i=0; i< size; i++){
        //these lengths are random but over 128 so no caching of Longs
        seqs.add(BusinessObjFactory.createOfRandomLength());
    }
}
@Benchmark
public double objectCollector() {       

    return seqs.stream()
                .map(BusinessObj::getLength)
                .collect(MyUtil.myCalcLongCollector())
                .getAsDouble();
}

@Benchmark
public double primitiveCollector() {

    LongStream stream= seqs.stream()
                                    .mapToLong(BusinessObj::getLength);
    return MyUtil.myCalc(stream)        
                        .getAsDouble();
}

public static void main(String[] args) throws RunnerException{
    Options opt = new OptionsBuilder()
                        .include(MyBenchmark.class.getSimpleName())
                        .build();

    new Runner(opt).run();
}

}

Here are the results:

# JMH 1.9.3 (released 4 days ago)
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_31.jdk/Contents/Home/jre/bin/java
# VM options: <none>
# Warmup: 20 iterations, 1 s each
# Measurement: 20 iterations, 1 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
# Benchmark: org.sample.MyBenchmark.objectCollector

# Run complete. Total time: 01:30:31

Benchmark                        (size)  Mode  Cnt          Score         Error  Units
MyBenchmark.objectCollector           1  avgt  200        140.803 ±       1.425  ns/op
MyBenchmark.objectCollector         100  avgt  200       5775.294 ±      67.871  ns/op
MyBenchmark.objectCollector        1000  avgt  200      70440.488 ±    1023.177  ns/op
MyBenchmark.objectCollector      100000  avgt  200   10292595.233 ±  101036.563  ns/op
MyBenchmark.objectCollector     1000000  avgt  200  100147057.376 ±  979662.707  ns/op
MyBenchmark.primitiveCollector        1  avgt  200        140.971 ±       1.382  ns/op
MyBenchmark.primitiveCollector      100  avgt  200       4654.527 ±      87.101  ns/op
MyBenchmark.primitiveCollector     1000  avgt  200      60929.398 ±    1127.517  ns/op
MyBenchmark.primitiveCollector   100000  avgt  200    9784655.013 ±  113339.448  ns/op
MyBenchmark.primitiveCollector  1000000  avgt  200   94822089.334 ± 1031475.051  ns/op

As you can see, the primitive Stream version is slightly faster, but even when there are 1 million elements in the collection, it is only about 0.005 seconds (roughly 5 ms, or about 5%) faster on average.

For my API I would rather keep to the cleaner object Stream conventions and use the boxed version, since the performance penalty is so minor.

Thanks to everyone who shed insight into this issue.

Community
dkatzel
  • That depends on the task you're solving. In your case probably your calculations are quite complex, so boxing overhead is not significant. I almost implemented primitive collectors in my library and performance boost can be quite significant from 30% on "grouping by last digit" task, to 2x on string joining and 5x on "sum by sign" task. See my benchmark and results [here](https://gist.github.com/amaembo/fe03b2944cbb6e621158). – Tagir Valeev May 20 '15 at 16:22
  • @TagirValeev yes you are probably correct. Most of the time is spent inside my computation which has already unboxed everything – dkatzel May 20 '15 at 18:11
  • @dkatzel Try measuring the difference in {speed,scalability} between `IntStream.sum()` and `Stream.reduce(0, Integer::sum, Integer::sum)`. You'll see that in these cases, the difference in both speed and parallel speedup is enormous -- which is why the primitive specializations were justified for the basic operations. But as operations get more heavyweight, the benefit is lower, and boxing becomes more acceptable. – Brian Goetz Jul 08 '15 at 20:24
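The two operations named in that last comment can be written side by side as below. This sketch only shows that they compute the same value; it makes no timing claim, and a harness such as JMH would be needed to measure the difference. The class and method names are invented for the example.

```java
import java.util.stream.IntStream;

public class SumComparisonSketch {

    /** Primitive path: values stay as int from source to result. */
    static int primitiveSum(int n) {
        return IntStream.rangeClosed(1, n).sum();
    }

    /** Boxed path: every element becomes an Integer before being reduced. */
    static int boxedSum(int n) {
        return IntStream.rangeClosed(1, n)
                .boxed()
                .reduce(0, Integer::sum, Integer::sum);
    }

    public static void main(String[] args) {
        // Both lines print the same number; only the amount of boxing differs.
        System.out.println(primitiveSum(1000));
        System.out.println(boxedSum(1000));
    }
}
```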