
This is my sequential synchronous code:

int pageNumber = 0;
int pageSize = 100;
ResultSetType resultSet;
do {
    resultSet = this.serveiTerritorialClientRepository
        .getOid("2.16.724.4.400", pageNumber, pageSize);
            
    // do something with resultSet
            
    pageNumber++;
} while (resultSet.getResultCount() >= pageSize); // stop once a page comes back short

Note: resultSet also has a method that returns the total number of items: resultSet.getTotalCount().

I'd like to parallelize this code using CompletableFutures.

I know how to build CompletableFuture:

CompletableFuture<ResultSetType> completableFutureOfResultSetType = 
    CompletableFuture.supplyAsync(() -> this.serveiTerritorialClientRepository
        .getOid("2.16.724.4.400", pageNumber, pageSize)
);

The problem is how to coordinate all of them. In other words, how many CompletableFutures do I need to create?

Jordi
  • so you think that making many `CompletableFuture`s is going to make your code faster? is that the reason you want to do this? If so, it will not, you will need to coordinate each of them to be scheduled one after another, because you have a dependency between them all. – Eugene Jun 14 '21 at 14:29
  • @Eugene about the comment you left on my answer: you're right, I didn't read properly. Indeed, I agree with you, the task is simply not eligible to parallelization. – Matteo NNZ Jun 14 '21 at 15:02
  • @Eugene I don't agree. Since each thread can handle one page. – Jordi Jun 14 '21 at 15:19
  • are you saying that you want to create many `CompletableFuture`s _first_ and then wait for all of them to finish? – Eugene Jun 14 '21 at 16:11
  • The problem is, how many `CompletableFuture`s do I need to create, since I don't know the total count of items to get... – Jordi Jun 14 '21 at 19:55
  • I knew your question was incorrect first time I've read it. But now you did not make it any better, what is the point to mention `resultSet.getTotalCount()` if it's not used anywhere in your example? – Eugene Jun 14 '21 at 20:09
  • Only just in case it could help... – Jordi Jun 14 '21 at 20:11
  • `getResultCount` is meant to return the number of items in the returned page. `getTotalCount` is meant to return the number of all items behind the service. For example, `user_tbl` has 1000 users, but a page result can contain only 56... – Jordi Jun 14 '21 at 20:16
  • and now read my first comment. Do you or do you not have a dependency between all the futures? you can only build your next future when you know the result of the first. doesn't it? and I'll repeat : unless you post a comment with `@Eugene` - there is no way for me to know you did – Eugene Jun 14 '21 at 20:24

1 Answer


It is as simple as:

 List<CompletableFuture<ResultSetType>> futures = new ArrayList<>();
 while (resultSet.getResultCount() < pageSize) {
    // A lambda can only capture effectively final locals, so copy the counter first
    final int page = pageNumber;
    CompletableFuture<ResultSetType> completableFutureOfResultSetType =
        CompletableFuture.supplyAsync(
            () -> this.serveiTerritorialClientRepository.getOid("2.16.724.4.400", page, pageSize)
        );
    futures.add(completableFutureOfResultSetType);
    pageNumber++;
 }

 // Java does not allow generic array creation, hence the raw CompletableFuture[0]
 CompletableFuture<Void> all = CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));

The problem is that all is a CompletableFuture<Void>, so you would then need to join each of them to get the resultSet:

futures.stream().map(CompletableFuture::join).collect(Collectors.toList());

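The result of the above is a List of your resultSets. Note that once all has completed (for example after all.join(), or inside a stage chained to all), the join calls above do not actually wait: every future has already finished, which is what CompletableFuture::allOf guarantees.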
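To make the allOf-then-join pattern concrete, here is a minimal sketch. The class AllOfJoinSketch and the method fetchPage are hypothetical stand-ins for the real repository call, which simply return a label per page, and the number of pages is assumed to be known up front.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class AllOfJoinSketch {

    // Hypothetical stand-in for the repository call; returns a label instead of a ResultSetType.
    static String fetchPage(int pageNumber, int pageSize) {
        return "page-" + pageNumber;
    }

    // Starts one future per page, then gathers the results in page order.
    static List<String> fetchAll(int pageCount, int pageSize) {
        List<CompletableFuture<String>> futures = IntStream.range(0, pageCount)
            .mapToObj(page -> CompletableFuture.supplyAsync(() -> fetchPage(page, pageSize)))
            .collect(Collectors.toList());

        CompletableFuture<Void> all =
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));

        // The function passed to thenApply runs only after every future has completed,
        // so the join() calls inside it return immediately.
        return all.thenApply(ignored -> futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList()))
            .join();
    }

    public static void main(String[] args) {
        List<String> pages = fetchAll(5, 100);
        System.out.println(pages.size() + " pages fetched: " + pages);
    }
}
```

Because the futures are kept in a list, the results come back in page order regardless of which call finishes first.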

Another problem that might arise is that some of the futures you supply to allOf might fail. Do you need to filter out the ones that failed and retry them? It depends on your usage. If you want to fail when at least one of them failed, inspect all.isCompletedExceptionally(). The documentation of allOf says that if at least one of the given futures completes exceptionally, the returned future does too.
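One possible way to find out which futures failed is sketched below. Everything here is illustrative: AllOfFailureSketch, fetchPage, startAll and failedPages are made-up names, and page 2 is rigged to throw so that there is something to detect. Whether to retry the failed pages or abort is left to the caller.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

class AllOfFailureSketch {

    // Hypothetical page fetch: page 2 always fails, to simulate a broken call.
    static String fetchPage(int pageNumber) {
        if (pageNumber == 2) {
            throw new IllegalStateException("page " + pageNumber + " failed");
        }
        return "page-" + pageNumber;
    }

    static List<CompletableFuture<String>> startAll(int pageCount) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (int i = 0; i < pageCount; i++) {
            final int page = i; // lambdas need an effectively final copy
            futures.add(CompletableFuture.supplyAsync(() -> fetchPage(page)));
        }
        return futures;
    }

    // Waits for every future, then returns the indexes of those that completed exceptionally.
    static List<Integer> failedPages(List<CompletableFuture<String>> futures) {
        CompletableFuture<Void> all =
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0]));

        // exceptionally(...) swallows the failure so that join() only waits and never throws.
        all.exceptionally(ex -> null).join();

        List<Integer> failed = new ArrayList<>();
        if (all.isCompletedExceptionally()) {
            for (int i = 0; i < futures.size(); i++) {
                if (futures.get(i).isCompletedExceptionally()) {
                    failed.add(i);
                }
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        System.out.println("failed pages: " + failedPages(startAll(4)));
    }
}
```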


As to how many futures you need, it depends.

In the example you provide, no pool is passed to supplyAsync, so the tasks run on the default executor (normally the common ForkJoinPool) and you might get a lot of threads involved. This Q&A explains why. In general it is highly recommended to provide a pool for your actions (supplyAsync has an overload that takes an Executor). In that case it does not even matter how many futures you create, as the pool is responsible for scheduling them. As for how many threads you need, that is impossible to answer without measuring. Usually (we have observed this in our own usage), a few threads (2-4) are more than enough.
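As a rough illustration of passing your own pool, the sketch below submits more futures than there are threads and lets a fixed-size pool work through them. BoundedPoolSketch and fetchPage are hypothetical names, and the pool size is an arbitrary starting point that should be tuned by measuring.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class BoundedPoolSketch {

    // Hypothetical stand-in for the repository call.
    static String fetchPage(int pageNumber) {
        return "page-" + pageNumber;
    }

    // Creates pageCount futures, but at most `threads` of them run at the same time.
    static List<String> fetchAll(int pageCount, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<CompletableFuture<String>> futures = IntStream.range(0, pageCount)
                // The second argument makes supplyAsync use our pool instead of the default one.
                .mapToObj(page -> CompletableFuture.supplyAsync(() -> fetchPage(page), pool))
                .collect(Collectors.toList());

            return futures.stream()
                .map(CompletableFuture::join)
                .collect(Collectors.toList());
        } finally {
            pool.shutdown(); // otherwise the non-daemon pool threads keep the JVM alive
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(20, 4).size() + " pages fetched with 4 threads");
    }
}
```

In a real application the pool would more likely be created once and shared, rather than built and shut down on every call as this sketch does.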

Eugene
  • Sorry, @Eugene. I miswrote the post and have edited it. The problem is that `resultSet` is undefined the first time. Sequentially, I first get the first `resultSet`. When `resultSet.getResultCount() < pageSize`, the loop stops. I've also added a note: `resultSet` has a `getTotalCount` method that returns how many items there are in total. – Jordi Jun 14 '21 at 20:05
  • @Jordi this does not change the answer much. You would need to make the first call to `getOid` outside of the loop and build the needed futures - in the loop. – Eugene Jun 14 '21 at 20:37
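
The suggestion in the last comment (make the first call outside the loop, then build the remaining futures inside it) might look roughly like the sketch below. It is not the original poster's code: PagedFetchSketch, Page and getPage are hypothetical stand-ins for ResultSetType and getOid, the int return types of getResultCount and getTotalCount are assumptions, and the backend is faked with a fixed 250 items.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

class PagedFetchSketch {

    // Hypothetical stand-in for ResultSetType; the int getters are assumptions.
    static final class Page {
        private final int resultCount;
        private final int totalCount;

        Page(int resultCount, int totalCount) {
            this.resultCount = resultCount;
            this.totalCount = totalCount;
        }

        int getResultCount() { return resultCount; }
        int getTotalCount() { return totalCount; }
    }

    static final int TOTAL_ITEMS = 250; // size of the fake backend

    // Hypothetical stand-in for getOid(...): the last page may come back short.
    static Page getPage(int pageNumber, int pageSize) {
        int remaining = TOTAL_ITEMS - pageNumber * pageSize;
        return new Page(Math.max(0, Math.min(pageSize, remaining)), TOTAL_ITEMS);
    }

    static List<Page> fetchAll(int pageSize, Executor pool) {
        // The first call is synchronous: its total count tells us how many pages exist.
        Page first = getPage(0, pageSize);
        int pageCount = (first.getTotalCount() + pageSize - 1) / pageSize; // ceiling division

        List<CompletableFuture<Page>> futures = new ArrayList<>();
        futures.add(CompletableFuture.completedFuture(first));
        for (int i = 1; i < pageCount; i++) {
            final int page = i; // effectively final copy for the lambda
            futures.add(CompletableFuture.supplyAsync(() -> getPage(page, pageSize), pool));
        }
        return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            List<Page> pages = fetchAll(100, pool);
            int items = pages.stream().mapToInt(Page::getResultCount).sum();
            System.out.println(pages.size() + " pages, " + items + " items");
        } finally {
            pool.shutdown();
        }
    }
}
```

This only works when the total count is available from the first response; with nothing but getResultCount to go on, each request depends on the previous one and the calls stay sequential.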