
For some reason, a Java Stream generates more values than it consumes (it makes extra calls to the iterator's hasNext() and next() methods).

Here's the synthetic example.

I have a generator in the form of an Iterator:

@RequiredArgsConstructor
static class TestIterator implements Iterator<Integer> {
    private final int bound;
    private final Random rnd = new Random();

    private int current = 0;

    @Override public boolean hasNext() {
        return current < bound;
    }

    @Override public Integer next() {
        current = rnd.nextInt(20);
        System.out.println("Generated: " + current);
        return current;
    }
}

Now I'm trying to build a flattened Stream from several of these Iterators:

public static void main(String... args) {

    List<Iterator<Integer>> iterators = asList(
        new TestIterator(18),
        new TestIterator(18),
        new TestIterator(18));
    Stream<Integer> streams = iterators.stream()
        .map(iter -> (Iterable<Integer>) () -> iter)
        .flatMap(iter -> StreamSupport.stream(iter.spliterator(), false)) // <-- Here the stream of streams is flattened into a single stream of integers, with 'parallel' set to false
        .limit(5); // <-- Here the limit is set to 5 elements

    streams.forEach(i -> System.out.println("***Consumed: " + i));
}

Surprisingly to me, the output is the following:

Generated: 1
***Consumed: 1
Generated: 19
***Consumed: 19
Generated: 7
***Consumed: 7
Generated: 7
***Consumed: 7
Generated: 7
***Consumed: 7
Generated: 4
Generated: 3
Generated: 8
Generated: 14
Generated: 0
Generated: 16
Generated: 10
Generated: 3
Generated: 19

So the Stream generates more values than are passed to the consumer in forEach, even though 'parallel' is explicitly set to false.

In my real-world scenario, hasNext() and next() are very expensive because they fetch data from external services.

Can anybody explain how to limit the results so that no extra values are generated?

Thanks in advance.

Ole V.V.
Viktor Molokanov

2 Answers

5

This is a known JDK bug that was fixed in JDK 10+ and backported to openjdk8u222, so updating your Java version will address the issue.
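If upgrading is not an option, one possible workaround is to avoid flatMap altogether and chain the iterators by hand, so that the stream pulls a single element at a time. The following is only a minimal sketch under that assumption: `LazyFlatten`, `concat`, `lazyStream` and `counting` are hypothetical names introduced for illustration, not part of the JDK or of the original code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class LazyFlatten {

    /** Hypothetical helper: walks the source iterators one after another, one element per call. */
    static <T> Iterator<T> concat(List<? extends Iterator<? extends T>> sources) {
        Iterator<? extends Iterator<? extends T>> outer = sources.iterator();
        return new Iterator<T>() {
            private Iterator<? extends T> current = null;

            @Override public boolean hasNext() {
                // Move on to the next source only when the current one is exhausted.
                while (current == null || !current.hasNext()) {
                    if (!outer.hasNext()) {
                        return false;
                    }
                    current = outer.next();
                }
                return true;
            }

            @Override public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    /** Wraps the chained iterator in a sequential stream, without using flatMap. */
    static <T> Stream<T> lazyStream(List<? extends Iterator<? extends T>> sources) {
        return StreamSupport.stream(
            Spliterators.spliteratorUnknownSize(concat(sources), Spliterator.ORDERED), false);
    }

    /** Hypothetical endless source that counts every next() call (stands in for an expensive call). */
    static Iterator<Integer> counting(AtomicInteger generated) {
        return new Iterator<Integer>() {
            @Override public boolean hasNext() { return true; }
            @Override public Integer next() { return generated.incrementAndGet(); }
        };
    }

    public static void main(String[] args) {
        AtomicInteger generated = new AtomicInteger();
        List<Iterator<Integer>> sources = Arrays.asList(
            counting(generated), counting(generated), counting(generated));
        List<Integer> consumed = lazyStream(sources).limit(5).collect(Collectors.toList());
        System.out.println("Consumed " + consumed.size() + ", generated " + generated.get());
    }
}
```

Because limit() short-circuits a sequential stream backed by a single iterator, next() should be called only as many times as elements are consumed, even when the sources are endless.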

geobreze
  • I've updated the JDK to 1.8.0_281 and, interestingly, the issue is still there. It works well with a single iterator converted to a stream, stopping right after `limit` elements are consumed, but it still doesn't work with a list of streams created from iterators and flattened. – Viktor Molokanov Apr 19 '21 at 09:49
  • The problem is that it generates *everything* from the first stream in the chain. But what if the first stream is infinite? For example: `Iterable iter = () -> new TestIterator(20); Stream stream1 = StreamSupport.stream(iter.spliterator(), false).limit(10); stream1.forEach()`. Then the program runs forever (I just checked and verified that the _281 version is used). – Viktor Molokanov Apr 19 '21 at 09:54
  • I've checked this issue on java 11.0.6 2020-01-14 LTS Java(TM) SE Runtime Environment 18.9 (build 11.0.6+8-LTS) Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.6+8-LTS, mixed mode) and everything was working fine with an infinite iterator. Code: https://gist.github.com/geobreze/712f667ae2d53a9307f3ada549dedefa – geobreze Apr 19 '21 at 10:26
  • Also, I just checked this code on openjdk version "1.8.0_222" OpenJDK Runtime Environment (build 1.8.0_222-b10) OpenJDK 64-Bit Server VM (build 25.222-b10, mixed mode), using the docker container `openjdk:8u222-slim-buster`, and got the following output: `Generated: 9`, `***Consumed: 9`, `Generated: 17`, `***Consumed: 17`, `Generated: 2`, `***Consumed: 2`, `Generated: 8`, `***Consumed: 8`, `Generated: 18`, `***Consumed: 18` – geobreze Apr 19 '21 at 10:28
  • I confirm that it works perfectly with OpenJDK 1.8.0_222_b10. My confusion came from the fact that I tried it on Oracle JDK 1.8.0_281, where the issue is still there. It's interesting that a bug is fixed in OpenJDK but not in the "official" one. Thanks a lot for your help. – Viktor Molokanov Apr 19 '21 at 11:09
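Since the comments above turn on which runtime is actually in use, a small self-check may help. The sketch below prints the vendor and version of the running JVM and counts how many values flatMap pulls when only 5 are consumed. `FlatMapLimitCheck`, `countingSource` and `generatedFor` are hypothetical names used for illustration, and the sources are bounded to 100 elements so the check terminates even on an affected runtime.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;

public class FlatMapLimitCheck {

    /** Hypothetical bounded source: yields 0..size-1 and counts every next() call. */
    static Iterator<Integer> countingSource(int size, AtomicInteger generated) {
        return new Iterator<Integer>() {
            private int i = 0;

            @Override public boolean hasNext() {
                return i < size;
            }

            @Override public Integer next() {
                if (i >= size) {
                    throw new NoSuchElementException();
                }
                generated.incrementAndGet();
                return i++;
            }
        };
    }

    /** Returns how many values were generated while only `limit` of them were consumed. */
    static int generatedFor(int limit) {
        AtomicInteger generated = new AtomicInteger();
        List<Iterator<Integer>> sources = Arrays.asList(
            countingSource(100, generated), countingSource(100, generated));
        sources.stream()
            .map(it -> (Iterable<Integer>) () -> it)
            .flatMap(it -> StreamSupport.stream(it.spliterator(), false))
            .limit(limit)
            .collect(Collectors.toList());
        return generated.get();
    }

    public static void main(String[] args) {
        // Standard system properties identifying the running JVM.
        System.out.println(System.getProperty("java.vendor") + " "
            + System.getProperty("java.version"));
        System.out.println("Consumed 5, generated " + generatedFor(5));
    }
}
```

On a runtime that contains the fix, the generated count should equal the limit; a larger number suggests that flatMap is still draining the inner stream eagerly.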
0

As I can see in your code, you have made the integer field `bound` final.

Because it is an immutable value, your hasNext() calls keep running without stopping.

Dharman
  • I'm not sure I understand that. What is an immutable value? The `current` variable is set inside the `next()` method. And even if `hasNext()` returns true all the time, why would more values be emitted than the limit that was set? – Viktor Molokanov Apr 18 '21 at 16:43