17

The official Oracle documentation says:

Note that you may lose the benefits of parallelism if you use operations like forEachOrdered with parallel streams. Oracle - Parallelism

Why would anyone use forEachOrdered with a parallel stream if we are losing parallelism?

Jacob Quisenberry
Sagar

3 Answers

17

Depending on the situation, one does not lose all the benefits of parallelism by using forEachOrdered.

Assume we have something like this:

stringList.parallelStream().map(String::toUpperCase)
                           .forEachOrdered(System.out::println);

In this case, the forEachOrdered terminal operation is guaranteed to print the uppercased strings in encounter order, but we should not assume that the elements are passed to the map intermediate operation in that same order. The map operation can be executed by multiple threads concurrently. So one may still benefit from parallelism; we are just not leveraging its full potential. To conclude, use forEachOrdered when the action must be performed in the encounter order of the stream.
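As a rough illustration of the point above, the following sketch records which threads run the map step while forEachOrdered consumes the results in encounter order. The class name, the `mapThreads` set, and the `upperCaseInOrder` helper are made up for this example; which threads show up (and how many) depends on the machine and the input size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class OrderedPrintSketch {
    // Hypothetical bookkeeping: names of the threads that executed the map step.
    static final Set<String> mapThreads = ConcurrentHashMap.newKeySet();

    static List<String> upperCaseInOrder(List<String> input) {
        // A plain ArrayList is acceptable here because forEachOrdered performs
        // the action for one element at a time, in encounter order.
        List<String> out = new ArrayList<>();
        input.parallelStream()
             .map(s -> {
                 mapThreads.add(Thread.currentThread().getName());
                 return s.toUpperCase();
             })
             .forEachOrdered(out::add);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(upperCaseInOrder(Arrays.asList("a", "b", "c", "d")));
        System.out.println("map ran on: " + mapThreads);
    }
}
```

The returned list always follows the order of the input list, whichever threads performed the mapping.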

Edit, following your comment:

What happens when you skip map operation? I am more interested in forEachOrdered right after parallelStream()

If you're referring to something like:

 stringList.parallelStream().forEachOrdered(action);

There is no benefit in doing this, and I doubt that's what the designers had in mind when they created the method. In that case, it would make more sense to do:

stringList.stream().forEach(action);

To expand on your question "Why would anyone use forEachOrdered with a parallel stream if we are losing parallelism": say you want to perform an action on each element with respect to the stream's encounter order. In that case you need forEachOrdered, because the forEach terminal operation is non-deterministic on a parallel stream: it makes no guarantee about the order in which elements are processed. Hence the API offers forEach, which is enough for sequential streams, and forEachOrdered, which is mainly useful for parallel ones.
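The difference can be sketched as follows. The class and method names are invented for this example; the forEach variant collects into a thread-safe queue because its action may be invoked from several threads at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class ForEachVsOrdered {
    // forEach on a parallel stream: every element is visited, but in no guaranteed order.
    static List<Integer> viaForEach(List<Integer> src) {
        Queue<Integer> seen = new ConcurrentLinkedQueue<>();
        src.parallelStream().forEach(seen::add);
        return new ArrayList<>(seen);
    }

    // forEachOrdered on a parallel stream: elements are visited in encounter order.
    static List<Integer> viaForEachOrdered(List<Integer> src) {
        List<Integer> seen = new ArrayList<>();
        src.parallelStream().forEachOrdered(seen::add);
        return seen;
    }

    public static void main(String[] args) {
        List<Integer> src = IntStream.range(0, 20).boxed().collect(Collectors.toList());
        System.out.println("forEach:        " + viaForEach(src));
        System.out.println("forEachOrdered: " + viaForEachOrdered(src));
    }
}
```

On a multi-core machine the first line will often come out shuffled, though that is not guaranteed on any particular run; the second line always matches the source order.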

Ousmane D.
  • What happens when you skip map operation? I am more interested in forEachOrdered right after `parallelStream()` – Sagar Nov 16 '17 at 19:38
  • @Kaunteya not much changes, streams *are driven* by the terminal operation anyway – Eugene Nov 16 '17 at 19:50
  • If you're referring to something like `stringList.parallelStream().forEachOrdered(action);`, then the answer to your question _Why would anyone use forEachOrdered with a parallel stream if we are losing parallelism_ is that you have no other choice. Say you want to perform an _action_ on each element with respect to the stream's encounter order; then you need to use it, as the `forEach` terminal operation is _non-deterministic_ when used in parallel. – Ousmane D. Nov 16 '17 at 19:53
  • A more interesting example would be `stringList.parallelStream().sorted().forEachOrdered(action)` – Holger Nov 16 '17 at 23:02
  • @Holger... so what makes this one interesting is the addition of `sorted`? I can't tell what makes it a *more* interesting one :| – Eugene Nov 17 '17 at 05:49
  • @Eugene: for an ordered stream, the parallel workers have to wait at the consumer until all elements coming earlier in encounter order have been consumed, so the parallel processing of the `map` step (and all other immediately preceding stateless intermediate operations) is very limited. In contrast, a `sorted` step will perform the sorting of the entire stream in parallel without being affected by the constraints of the subsequent `forEachOrdered`. That would also apply to intermediate steps before the sorting, as those can also run concurrently without restrictions. – Holger Nov 17 '17 at 07:30
  • @Aominè Do you think `stringList.parallelStream().forEachOrdered(action);` not only provides no benefit over `stream().forEachOrdered(action)` but also hogs threads in the common fork-join thread pool to achieve the same thing? – Sagar Nov 28 '17 at 17:01
  • @Kaunteya It's possible to do `stream().forEachOrdered(action)`, but ideally it's _much_ better to do `stream().forEach(action)`, as the former is intended for parallel streams. I may be wrong, but I doubt that `stringList.parallelStream().forEachOrdered(action);` will actually spin up multiple threads. The Javadoc states that `parallelStream()` _Returns a **possibly** parallel Stream with this collection as its source. It is allowable for this method to return a sequential stream._ So I am thinking `stringList.parallelStream().forEachOrdered(action);` will only be executed in one thread. – Ousmane D. Nov 29 '17 at 21:38
  • Maybe @Holger can enlighten us a little regarding your question, as I am interested to know whether what I've said is valid. – Ousmane D. Nov 29 '17 at 21:47
  • @Aominè Look for the answer I have posted if you are interested. – Sagar Dec 03 '17 at 18:01
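The pipeline Holger mentions in the comments can be sketched as below. The class and method names are hypothetical, and the sketch only shows that the result arrives in sorted encounter order; it does not measure how much of the work actually runs in parallel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SortedThenOrdered {
    static List<String> sortedInOrder(List<String> input) {
        List<String> out = new ArrayList<>();
        input.parallelStream()
             .sorted()                  // the sort may be carried out in parallel
             .forEachOrdered(out::add); // consumption then follows the sorted order
        return out;
    }

    public static void main(String[] args) {
        System.out.println(sortedInOrder(Arrays.asList("pear", "apple", "fig", "kiwi")));
    }
}
```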
3

I don't really get the question here. Why? Because you simply have no alternative: you have so much data that parallel streams will help you (this still needs to be proven), yet you still need to preserve the order, thus forEachOrdered. Notice that the documentation says you may lose the benefits, not that you will; you would have to measure and see.

Eugene
0

I found that stream().forEachOrdered() is ~50% faster than its parallel counterpart. Plus, the parallel one uses at least one thread from the common fork-join thread pool, which means one less thread for other parallel streams running in the JVM.

import java.util.stream.IntStream;

public class Main {
    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        IntStream.range(0, 10000000).parallel().forEachOrdered(i -> {
            // System.out.println(Thread.currentThread().getName());
            int p = 1 * 1;
        });
        // Elapsed wall-clock time in milliseconds
        System.out.println(System.currentTimeMillis() - start);
    }
}
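To compare the two variants side by side, a naive timing sketch could look like the following. The class and the `sumOrdered` helper are invented for this example, and a single un-warmed run like this is only a rough indication; a proper harness such as JMH would be needed for trustworthy numbers.

```java
import java.util.stream.IntStream;

class OrderedTiming {
    // Sums the stream through forEachOrdered. The one-element array is safe here
    // because forEachOrdered performs the action for one element at a time.
    static long sumOrdered(IntStream stream) {
        long[] sum = {0};
        stream.forEachOrdered(i -> sum[0] += i);
        return sum[0];
    }

    public static void main(String[] args) {
        int n = 10_000_000;

        long t0 = System.nanoTime();
        long seq = sumOrdered(IntStream.range(0, n));
        long t1 = System.nanoTime();
        long par = sumOrdered(IntStream.range(0, n).parallel());
        long t2 = System.nanoTime();

        System.out.println("sequential: " + (t1 - t0) / 1_000_000 + " ms, sum=" + seq);
        System.out.println("parallel:   " + (t2 - t1) / 1_000_000 + " ms, sum=" + par);
    }
}
```

Both calls produce the same sum; only the elapsed times differ, and those will vary between machines and runs.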

Sagar