2

If I use a map operation in a stream pipeline with a forEach() terminal operation (which does not honor encounter order, whether the stream is sequential or parallel) on a list as the source, will map respect the encounter order of the list for a sequential or a parallel stream?

List<Integer> someList = Arrays.asList(1, 2, 3, 4, 5);
someList.stream().map(i -> i * 2).forEach(System.out::println); // this is a sequential stream
someList.parallelStream().map(i -> i * 2).forEach(System.out::println); // this is a parallel stream

If yes, then this post https://stackoverflow.com/a/47337690/5527839 mentions that the map operation will be performed in parallel. If order is maintained, how will a parallel stream improve performance? What is the point of using a parallel stream?

Ryuzaki L
Piyush Kumar
  • 1
    "_If order is maintained, how it will make the performance better_" - it will not. Maintaining order in parallel streams is likely to reduce performance. Parallel streams are no silver bullet, they don't improve everything automagically. – M. Prokhorov Apr 14 '19 at 14:15
  • 2
    Also: intermediate operations don't really "honor encounter order" per se. An intermediate operation (with the unfortunate exceptions of `.sorted()` and `.distinct()`) only ever cares about one single element (hence it is a "pipeline"), and doesn't need to care about what elements came before or will come after. Encounter order is normally only a thing for terminal operations. – M. Prokhorov Apr 14 '19 at 14:20
  • Ok... Thank you @M.Prokhorov for clearing my doubts :) – Piyush Kumar Apr 15 '19 at 06:41
  • @M.Prokhorov Here is a good read: https://www.logicbig.com/tutorials/core-java-tutorial/java-util-stream/ordering.html It says that, apart from `.sorted()` and `.distinct()`, `.skip()` and `.limit()` also care about the encounter order whether the stream is parallel or sequential, but with a parallel stream these methods have an impact on performance. – Piyush Kumar Apr 15 '19 at 06:50
  • 1
    Yes, I forgot about those two. They aren't as obvious, but as a rule of thumb if an intermediate operation hides some internal state - it will care about encounter order. That makes it about 50/50 in stateful vs non-stateful intermediates on stream. – M. Prokhorov Apr 15 '19 at 17:52
  • 1
    @M.Prokhorov and starting with Java 9, `takeWhile` and `dropWhile` will also depend on the encounter order. – Holger Apr 16 '19 at 11:59
  • @Holger, yes, those too are stateful operations. – M. Prokhorov Apr 16 '19 at 12:25

2 Answers

2

"If order is maintained, how will a parallel stream improve performance? What is the point of using a parallel stream?" Yes, you will still gain performance, though perhaps not as much as you might expect.

Even if you use forEachOrdered() with parallelStream(), the intermediate map operation is executed by concurrent threads; only the terminal forEachOrdered operation is made to process the results in order. Try the code below and you will see the parallelism in the map operation:

List<Integer> someList = Arrays.asList(1, 2, 3, 4, 5);

someList.stream().map(i -> {
    System.out.println(Thread.currentThread().getName() + " Normal Stream : " + i);
    return i * 2;
}).forEach(System.out::println); // this is a sequential stream

System.out.println("this is parallel stream");

someList.parallelStream().map(i -> {
    System.out.println(Thread.currentThread().getName() + " Parallel Stream : " + i);
    return i * 2;
}).forEachOrdered(System.out::println); // this is a parallel stream

"Will map honor encounter order? Is ordering in any way related to intermediate operations?"

With a parallel stream, map does not process the elements in any particular order; with a sequential stream, map in practice processes them in encounter order. This depends on the stream, not on the intermediate operation.
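To make the distinction between the order in which map visits elements and the order of the result concrete, here is a minimal sketch. The class and method names (MapProcessingOrder, processingOrder, doubled) are made up for illustration. It records the order in which map actually runs in a thread-safe queue, and separately collects a parallel pipeline into a list. The recorded processing order of the parallel run can differ from run to run, so no particular output is claimed for it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Collectors;

public class MapProcessingOrder {

    // Records the order in which map() visits the elements.
    // A concurrent queue is used because map() may run on several threads.
    static List<Integer> processingOrder(List<Integer> source, boolean parallel) {
        Queue<Integer> seen = new ConcurrentLinkedQueue<>();
        (parallel ? source.parallelStream() : source.stream())
                .map(i -> { seen.add(i); return i * 2; })
                .forEach(x -> { }); // unordered terminal operation, result discarded
        return new ArrayList<>(seen);
    }

    // The collected result of an ordered stream keeps encounter order,
    // even though map() itself may have run in a different order.
    static List<Integer> doubled(List<Integer> source) {
        return source.parallelStream()
                .map(i -> i * 2)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> someList = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println("sequential processing order: " + processingOrder(someList, false));
        System.out.println("parallel processing order:   " + processingOrder(someList, true));
        System.out.println("parallel collected result:   " + doubled(someList));
    }
}
```

The point of the sketch is that the last line is stable across runs while the second line need not be.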

Ryuzaki L
  • From what you are saying, can I assert that `map` does not honor the encounter order, regardless of whether the terminal operation respects encounter order and regardless of whether it is a parallel or a sequential stream? – Piyush Kumar Apr 14 '19 at 14:38
  • Have you tried my code? I added a print statement in the `map` operation. Are they printing in order with the parallel stream? @PiyushKumar – Ryuzaki L Apr 14 '19 at 14:40
  • Yes, map is printing in order in both the parallel and the sequential case. But for the parallel stream you have used the `forEachOrdered` terminal operation, which follows encounter order. The question is: what if the terminal operation does not follow the encounter order and I am using the map intermediate operation with a list as the source (which is an ordered collection and hence creates an ordered stream)? Will map honor encounter order? Is ordering in any way related to intermediate operations? – Piyush Kumar Apr 14 '19 at 14:50
  • Come on, how can `map` print in order with `parallelStream`? As for the question `will map honor encounter order ? Is ordering any way related to intermediate operations ?`: with a parallel stream `map` will not process the elements in any order, with a normal stream `map` will process them in order. It depends completely on the stream, not on the intermediate operation @PiyushKumar – Ryuzaki L Apr 14 '19 at 14:55
  • Can I conclude that if the terminal operation follows the encounter order then the intermediate operations will also follow it? M. Prokhorov also provides similar logic here: https://stackoverflow.com/questions/55676003/does-intermediate-operations-honors-encounter-order-when-terminal-operation-used#comment98037654_55676003 – Piyush Kumar Apr 14 '19 at 15:04
  • 1
    Nothing in the specification says that the *processing order* of `map` will ever match the *encounter order*, regardless of whether the Stream is parallel or sequential. It’s an implementation specific side effect that both orders seem to match with a sequential execution. – Holger Apr 16 '19 at 11:55
1

While many intermediate operations do preserve ordering without you having to ask for it explicitly, I always prefer to assume that data flowing through the Java Stream API isn't guaranteed to end up ordered in every scenario, even given the same code.

When the order of the elements must be preserved, it is enough to use an ordered terminal operation, and the data will be in order when it comes out. In your case I believe you'd be looking for

.forEachOrdered()
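As a minimal sketch of that suggestion (the class and method names below are hypothetical, chosen only for this example), the following runs map on a parallel stream and uses forEachOrdered to append the results to a plain ArrayList. forEachOrdered performs its action on one element at a time, in encounter order, which is why a non-thread-safe list works here; the same code with forEach would not be safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ForEachOrderedDemo {

    // Doubles each element on a parallel stream and appends the results
    // in encounter order via the ordered terminal operation.
    static List<Integer> doubleInOrder(List<Integer> source) {
        List<Integer> out = new ArrayList<>();
        source.parallelStream()
                .map(i -> i * 2)          // may run on several threads
                .forEachOrdered(out::add); // runs one element at a time, in order
        return out;
    }

    public static void main(String[] args) {
        List<Integer> someList = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(doubleInOrder(someList)); // prints [2, 4, 6, 8, 10]
    }
}
```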

"If order is maintained, how will a parallel stream improve performance? What is the point of using a parallel stream?"

I've heard many opinions on this. I believe you should only use parallel streams if you are doing a non-trivial amount of processing inside the pipeline; otherwise the overhead of managing the parallel stream will in most cases degrade performance compared to serial streams. If you are doing intensive processing, a parallel stream can still be faster than a serial one even when instructed to preserve the order because, after all, the data being processed is stored on the heap either way, and the references to that data are what gets ordered and passed out of the end of the pipe. All a stream needs to do for ordering is hand the references out in the same order they were encountered; it can work on the data in parallel and just wait if the data at the front of the queue isn't finished yet.

I'm sure this is a bit of an oversimplification, as there are cases where an ordered stream requires data to be shared from one element to the next (collectors, for instance). But the basic concept is valid, since even in this case a parallel stream can process at least two pieces of data at a time.
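A small sketch of this idea, under stated assumptions: `expensive` is a made-up stand-in for non-trivial per-element work, and the class name is hypothetical. The map step records which threads executed it, while the collected result still comes out in encounter order. How many threads take part depends on the machine and the common fork-join pool, so no thread count or timing is claimed.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class OrderedParallelWork {

    // Hypothetical stand-in for a costly, side-effect-free computation.
    static long expensive(int n) {
        long acc = n;
        for (int k = 0; k < 200_000; k++) {
            acc = (acc * 31 + k) % 1_000_003;
        }
        return acc;
    }

    // Runs the costly step in parallel; 'threads' collects the names of
    // the worker threads that executed map().
    static List<Long> run(List<Integer> source, Set<String> threads) {
        return source.parallelStream()
                .map(n -> {
                    threads.add(Thread.currentThread().getName());
                    return expensive(n);
                })
                .collect(Collectors.toList()); // result keeps encounter order
    }

    public static void main(String[] args) {
        List<Integer> input = IntStream.rangeClosed(1, 64).boxed().collect(Collectors.toList());
        Set<String> threads = ConcurrentHashMap.newKeySet();
        List<Long> parallelResult = run(input, threads);
        List<Long> serialResult = input.stream()
                .map(OrderedParallelWork::expensive)
                .collect(Collectors.toList());
        System.out.println("same order as serial: " + parallelResult.equals(serialResult));
        System.out.println("threads used by map:  " + threads);
    }
}
```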