I have a simple function and a simple question:
do I need the first toList()?

Arrays.stream(items)
   // build the task and submit it to the ExecutorService,
   // which returns a Future
   .map(item -> executeTask(item)) //Stream<Future<Result>>

   // generate all the Futures, stash them in a collection, 
   // then wait for them to complete
   .toList() //List<Future<Result>>
   .stream() //Stream<Future<Result>>

   //wait for result
   .map(Future::get) //Stream<Result>
   .toList();

In more detail, the function:

  1. takes in an array of items
  2. builds a task for each item and submits it to an ExecutorService, which returns a Future for each item
  3. waits for the results of all items before returning the list of results
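For reference, the three steps above can be sketched as a complete program. This is only a minimal sketch under assumptions: `executeTask` here is a hypothetical stand-in that computes the item's length, the pool size of 4 is arbitrary, and `join` is a helper added because `Future.get()` throws checked exceptions and cannot be passed to `map()` directly.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SubmitThenWait {
    // Hypothetical stand-in for the real per-item work: returns the item's length.
    static Future<Integer> executeTask(ExecutorService pool, String item) {
        Callable<Integer> task = () -> item.length();
        return pool.submit(task);
    }

    // Wraps Future.get() so that it can be used inside a stream pipeline.
    static <T> T join(Future<T> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    static List<Integer> process(String[] items) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // arbitrary size
        try {
            // Steps 1 and 2: submit every item and keep the Futures in a list.
            List<Future<Integer>> futures = Arrays.stream(items)
                    .map(item -> executeTask(pool, item))
                    .toList();
            // Step 3: wait for each result, in submission order.
            return futures.stream().map(SubmitThenWait::join).toList();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(process(new String[] {"a", "bb", "ccc"})); // prints [1, 2, 3]
    }
}
```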

My friend argues that we need that extra toList() (after the .map(item -> executeTask(item)) operation) because:

Streams are lazily evaluated, so the map() call isn't invoked until a terminal operation (toList()) is called. So we want to generate all the Futures, stash them in a collection, and then wait for them to complete.

But I think the function will work the same way even without that extra toList(), since we already have a toList() after .map(Future::get).
So is it really needed?
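One way to check the two claims is to record the order in which the pipeline stages run. The sketch below is an illustration only: it uses already-completed `CompletableFuture`s as stand-ins for tasks handed to an executor, and the `"submit"` / `"wait"` log entries and the class name are made up for the demonstration. It assumes a sequential stream on the OpenJDK implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Stream;

class StreamOrderDemo {
    // Returns the order of events, with or without the intermediate toList().
    static List<String> events(boolean collectFirst) {
        List<String> log = new ArrayList<>();
        Stream<CompletableFuture<String>> futures = Stream.of("a", "b", "c")
                .map(item -> {
                    log.add("submit " + item);
                    // Stand-in for submitting a task to an ExecutorService.
                    return CompletableFuture.completedFuture(item);
                });
        if (collectFirst) {
            // Terminal operation: forces every "submit" to happen now.
            futures = futures.toList().stream();
        }
        futures.map(future -> {
            String item = future.join(); // stand-in for Future.get()
            log.add("wait " + item);
            return item;
        }).toList();
        return log;
    }

    public static void main(String[] args) {
        System.out.println("without toList(): " + events(false));
        System.out.println("with toList():    " + events(true));
    }
}
```

Under those assumptions, the run without the intermediate toList() interleaves the events (submit a, wait a, submit b, wait b, ...), while the run with it logs all three submits before the first wait.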

shmosel
wayne
  • Streams process elements one by one, so if you didn't have that intermediate `toList()` then you'd be scheduling a task, waiting for it to complete, then scheduling the next task, waiting for _that_ task to complete, and so on (i.e., your code essentially becomes single-threaded). The intermediate `toList()` makes sure that all tasks are scheduled at once, and then you stream on the collection of `Future`s to wait for all of them to complete (though you might want to look into the `ExecutorService.invokeAll` methods). – Slaw Mar 16 '22 at 05:29
  • Thanks @Slaw, I wrote a short test and it shows you're right. Just wondering whether this one-by-one mechanism is documented anywhere? – wayne Mar 16 '22 at 05:49
  • I'm not sure if it's documented explicitly, but you can infer it from the fact it's possible to short-circuit infinite streams. Otherwise, `Stream.generate(this::createTask).map(this::executeTask).findFirst();` would never complete. – shmosel Mar 16 '22 at 06:04
  • It’s not specified in this way, because you should not rely on it. If an alternative Stream implementation finds a way to improve performance by processing two at a time, that would be fine. – Holger Mar 16 '22 at 11:17
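As a follow-up to the `ExecutorService.invokeAll` suggestion in the comments, here is a rough sketch of how that might look. It does not depend on the order in which a stream evaluates its elements, because `invokeAll` itself submits every task and blocks until all of them have completed. The class name, the item-length task, and the pool size are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class InvokeAllSketch {
    static List<Integer> process(List<String> items) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // arbitrary size
        try {
            // Build one Callable per item (hypothetical work: the item's length).
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String item : items) {
                tasks.add(() -> item.length());
            }
            // Submits all tasks and blocks until every one has completed.
            List<Future<Integer>> futures = pool.invokeAll(tasks);
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get()); // task is already done at this point
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(process(List.of("a", "bb", "ccc"))); // prints [1, 2, 3]
    }
}
```

The returned list of Futures is in the same order as the submitted tasks, so the results line up with the input items.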
