
Is there a way to wait for a CompletableFuture for a limited amount of time and, if it has not finished by then, return a different result without cancelling the future?

I have a service (let's call it expensiveService) that runs off to do its own thing. It returns a result:

enum Result {
    COMPLETED,
    PROCESSING,
    FAILED
}

I'm willing to [block and] wait for it for a short amount of time (let's say 2 s). If it doesn't finish, I want to return a different result, but I want the service to carry on doing its own thing. It would be the client's job to then inquire as to whether the service is finished or not (e.g. through websockets or whatever).

I.e. we have the following cases:

  • expensiveService.processAndGet() takes 1 s and completes its future. It returns COMPLETED.
  • expensiveService.processAndGet() fails after 1 s. It returns FAILED.
  • expensiveService.processAndGet() takes 5 s and completes its future. It returns PROCESSING. If we ask another service for the result, we get COMPLETED.
  • expensiveService.processAndGet() fails after 5 s. It returns PROCESSING. If we ask another service for the result, we get FAILED.

In this specific case, we actually need to fetch the current result object anyway on a timeout, resulting in the following additional edge-case. This causes some issues with the solutions suggested below:

  • expensiveService.processAndGet() takes 2.01 s and completes its future. It returns either PROCESSING or COMPLETED.

I'm also using Vavr and am open to suggestions using Vavr's Future.

We have come up with several possible solutions, each with its own positives and negatives:

#1 Wait for another Future

CompletableFuture<Result> f = expensiveService.processAndGet();
return f.applyToEither(Future.of(() -> {
            Thread.sleep(2000);
            return null;
        }).map(v -> resultService.get(processId)).toCompletableFuture(),
        Function.identity());

Problems

  1. The second resultService is always called.
  2. We take up the entire Thread for 2 s.

#1a Wait for another Future that checks the first Future

CompletableFuture<Result> f = expensiveService.processAndGet();
return f.applyToEither(Future.of(() -> {
            int attempts = 0;
            int timeout = 20;
            while (!f.isDone() && attempts * timeout < 2000) {
                Thread.sleep(timeout);
                attempts++;
            }
            return null;
        }).map(v -> resultService.get(processId)).toCompletableFuture(),
        Function.identity());

Problems

  1. The second resultService is still always called.
  2. We need to pass the first Future to the second, which isn't so clean.

#2 Object.notify

Object monitor = new Object();
CompletableFuture<Upload> process = expensiveService.processAndGet();
synchronized (monitor) {
    process.whenComplete((r, e) -> {
        synchronized (monitor) {
            monitor.notifyAll();
        }
    });
    try {
        int attempts = 0;
        int timeout = 20;
        while (!process.isDone() && attempts * timeout < 2000) {
            monitor.wait(timeout);
            attempts++;
        }
    } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
    }
}
if (process.isDone()) {
    return process.toCompletableFuture();
} else {
    return CompletableFuture.completedFuture(resultService.get(processId));
}

Problems

  1. Complex code (potential for bugs, not as readable).

#3 Vavr's Future.await

return Future.of(() -> expensiveService.processAndGet().get()) // assumed: the outer Future blocks on the inner one
        .await(2, TimeUnit.SECONDS)
        .recoverWith(e -> {
            if (e instanceof TimeoutException) {
                return Future.successful(resultService.get(processId));
            } else {
                return Future.failed(e);
            }
        })
        .toCompletableFuture();

Problems

  1. Needs a Future in a Future to avoid await cancelling the inner Future.
  2. Moving the first Future into a second breaks [legacy] code that relies on ThreadLocals.
  3. recoverWith and catching the TimeoutException isn't that elegant.

#4 CompletableFuture.orTimeout

return expensiveService.processAndGet()
        .orTimeout(2, TimeUnit.SECONDS)
        .<CompletableFuture<Upload>>handle((u, e) -> {
            if (u != null) {
                return CompletableFuture.completedFuture(u);
            } else if (e instanceof TimeoutException) {
                return CompletableFuture.completedFuture(resultService.get(processId));
            } else {
                return CompletableFuture.failedFuture(e);
            }
        })
        .thenCompose(Function.identity());

Problems

  1. Although in my case the processAndGet future is not cancelled, according to docs, it should be.
  2. The exception handling is not nice.

#5 CompletableFuture.completeOnTimeout

return expensiveService.processAndGet()
        .completeOnTimeout(null, 2, TimeUnit.SECONDS)
        .thenApply(u -> {
            if (u == null) {
                return resultService.get(processId);
            } else {
                return u;
            }
        });

Problems

  1. Although in my case the processAndGet future is not completed, according to docs, it should be.
  2. What if processAndGet wanted to return null as a different state?

All of these solutions have disadvantages and require extra code but this feels like something that should be supported either by CompletableFuture or Vavr's Future out-of-the-box. Is there a better way to do this?

Druckles
  • I would use [CompletableFuture.completeOnTimeout](https://docs.oracle.com/en/java/javase/13/docs/api/java.base/java/util/concurrent/CompletableFuture.html#completeOnTimeout%28T,long,java.util.concurrent.TimeUnit%29) for this; as in, `expensiveService.completeOnTimeout(Result.TIMED_OUT, 2, TimeUnit.SECONDS).get()`. But I don’t understand your 2.01 requirement; if 2.01 seconds is a valid duration, why not pass 2.01 seconds or more as a timeout? – VGR Jan 16 '20 at 17:14
  • Thank you. I will have a look tomorrow to see if this does what I'm looking for. – Druckles Jan 16 '20 at 21:58
  • The 2.01 s isn't exactly a valid duration, it's simply that it might come to a "race condition" where the result is fetched *after* the future has completed anyway. E.g. the first service actually needed 2.01 s, but the waiting and fetching of the result in its current state actually takes 2.02 s. It's just an arbitrary number that is close to 2 s but greater than it. – Druckles Jan 16 '20 at 21:59
  • Assuming the CompletableFuture properly handles interrupts, you can interrupt it after two seconds with [cancel(true)](https://docs.oracle.com/en/java/javase/13/docs/api/java.base/java/util/concurrent/CompletableFuture.html#cancel%28boolean%29). That’s as thread-safe as you can get—either the task was interrupted or it finished. No chance that it slipped through after the interrupt was issued. – VGR Jan 16 '20 at 23:52
  • Perhaps I still haven't explained well enough. The first CompletableFuture should (*must*) continue after the timeout. It can't be cancelled. The whole thing with 2.01 s is not that critical, it just means: if `expensiveService.processAndGet()` takes 2.01 s, the timeout function takes 2 s and the follow-up function `resultService.get()` takes 0.02 s, then the result of `resultService.get()` is ignored (it arrives after 2.02 s, the first future finishes after 2.01 s). This doesn't happen if you do an `orTimeout` or `completeOnTimeout` or similar. – Druckles Jan 17 '20 at 08:38
  • `completeOnTimeout` looks like it does what I want, *but* doesn't take a supplier, meaning we can't call `resultService.get()`. – Druckles Jan 17 '20 at 09:00
  • `orTimeout` does perform like Vavr Future's `await` without cancelling/completing the future. However, the docs are unclear as to whether this is intentional. Plus the `exceptionally` function in CompletableFuture does not return a CompletableFuture, meaning handling errors other than `TimeoutException` isn't great. – Druckles Jan 17 '20 at 09:03
  • I've added two more solutions using your suggestions @VGR but they're still not ideal. Thank you. – Druckles Jan 17 '20 at 09:58
  • “What if processAndGet wanted to return null as a different state?” Instead of null, do `Object timedOut = new Object();` before calling processAndGet. Then use `timedOut` instead of null, both in completeOnTimeout and in your equality check. It is a unique object that no other code can ever return. – VGR Jan 17 '20 at 14:21
  • It has to be of the same type as `processAndGet()`. In this example it was an enum, but it may be any object that semantically equals `timedOut`. – Druckles Jan 17 '20 at 14:50
  • Good point; the CompletableFuture returned by processAndGet would need to have thenApply called on it, mapping the result to a wrapper object (possibly a different enum), so you could use a unique, separate value. – VGR Jan 17 '20 at 14:56
  • How do you determine that `orTimeout` does not complete the future as documented? – Holger Jan 20 '20 at 16:23
  • @VGR `CompletableFuture` does not support interruption, in general. As the documentation says “*Method `cancel` has the same effect as `completeExceptionally(new CancellationException())`*” – Holger Jan 20 '20 at 16:39
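
The sentinel idea from the comments above could be sketched roughly as follows. This is only an illustration under assumptions: the class name `SentinelTimeout`, the helper `waitOrFallback`, and the demo methods are made up here and are not part of the question. The result is first passed through `thenApply`, so `completeOnTimeout` acts on a dependent stage rather than on the service's own future, and the unique `TIMED_OUT` object cannot collide with a real result, not even `null`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class SentinelTimeout {
    // Unique marker that no service can ever return as a real result.
    private static final Object TIMED_OUT = new Object();

    @SuppressWarnings("unchecked")
    static <T> CompletableFuture<T> waitOrFallback(
            CompletableFuture<T> future, Supplier<T> fallback, long timeout, TimeUnit unit) {
        return future
                .<Object>thenApply(value -> value)           // dependent stage; the original stays untouched
                .completeOnTimeout(TIMED_OUT, timeout, unit) // completes only the dependent stage
                .thenApplyAsync(r -> {                       // async, so the fallback does not run on the timer thread
                    if (r == TIMED_OUT) {
                        return fallback.get();
                    }
                    return (T) r;
                });
    }

    /** The service never finishes: the caller gets the fallback, the original stays pending. */
    static String timedOutCase() {
        CompletableFuture<String> original = new CompletableFuture<>();
        String result = waitOrFallback(original, () -> "PROCESSING", 50, TimeUnit.MILLISECONDS).join();
        return result + " / original done: " + original.isDone();
    }

    /** A legitimate null result is passed through and not mistaken for a timeout. */
    static String nullCase() {
        CompletableFuture<String> original = CompletableFuture.<String>completedFuture(null);
        return String.valueOf(waitOrFallback(original, () -> "PROCESSING", 50, TimeUnit.MILLISECONDS).join());
    }

    public static void main(String[] args) {
        System.out.println(timedOutCase());
        System.out.println(nullCase());
    }
}
```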

1 Answer


It’s worth pointing out first how CompletableFuture works (and why it is named the way it is):

CompletableFuture<?> f = CompletableFuture.supplyAsync(supplier, executionService);

is basically equivalent to

CompletableFuture<?> f = new CompletableFuture<>();
executionService.execute(() -> {
    if(!f.isDone()) {
        try {
            f.complete(supplier.get());
        }
        catch(Throwable t) {
            f.completeExceptionally(t);
        }
    }
});

There is no connection from the CompletableFuture to the code being executed by the Executor. In fact, there can be an arbitrary number of ongoing completion attempts. That a particular piece of code is intended to complete a CompletableFuture instance becomes apparent only when it calls one of the completion methods.
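
As a minimal sketch of what "an arbitrary number of completion attempts" means (the class and method names are made up for this illustration), two attempts try to complete the same future and only the first one has any effect:

```java
import java.util.concurrent.CompletableFuture;

class FirstCompletionWins {
    /** Makes two completion attempts and reports the final value plus each attempt's outcome. */
    static String demo() {
        CompletableFuture<String> f = new CompletableFuture<>();
        boolean first = f.complete("first");   // this attempt wins and returns true
        boolean second = f.complete("second"); // already completed: ignored, returns false
        return f.join() + " " + first + " " + second;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "first true false"
    }
}
```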

Therefore, the CompletableFuture cannot affect the running operation in any way; in particular, it cannot interrupt it on cancellation. As the documentation of CompletableFuture says:

Method cancel has the same effect as completeExceptionally(new CancellationException())

So a cancellation is just another completion attempt, which will win if it is the first one, but not affect any other completion attempt.
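
A small experiment along these lines can make that visible. It is a sketch with hypothetical names, where `Thread.sleep` stands in for the expensive work: the future is cancelled while the task is running, yet the task is never interrupted and runs to its end.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

class CancelDoesNotInterrupt {
    /** Returns true if the future was cancelled and the task still ran to the end. */
    static boolean taskFinishedDespiteCancel() {
        AtomicBoolean finished = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch ended = new CountDownLatch(1);

        CompletableFuture<String> f = CompletableFuture.supplyAsync(() -> {
            started.countDown();
            try {
                Thread.sleep(200);   // stands in for the expensive work
                finished.set(true);  // only reached if the sleep was not interrupted
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                ended.countDown();
            }
            return "COMPLETED";
        });

        try {
            started.await();         // make sure the task is already running
            f.cancel(true);          // just another completion attempt on the future
            ended.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return f.isCancelled() && finished.get();
    }

    public static void main(String[] args) {
        System.out.println(taskFinishedDespiteCancel());
    }
}
```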

So orTimeout(long timeout, TimeUnit unit) is not much different in this regard. After the timeout has elapsed, it performs the equivalent of completeExceptionally(new TimeoutException()), which wins if no other completion attempt was faster. That affects dependent stages, but not other ongoing completion attempts, such as the one that expensiveService.processAndGet() has initiated in your case.
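
The following sketch shows this behaviour under stated assumptions: the `store` reference is a hypothetical stand-in for wherever the service records its state, and the delays are arbitrary. The caller sees a TimeoutException, while the background work still finishes and records its result.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

class OrTimeoutKeepsWorkRunning {
    static String demo() {
        // Hypothetical stand-in for the place where the service stores its state.
        AtomicReference<String> store = new AtomicReference<>("PROCESSING");
        CountDownLatch workDone = new CountDownLatch(1);

        CompletableFuture<String> f = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(500);               // work that outlasts the timeout
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            store.set("COMPLETED");
            workDone.countDown();
            return "COMPLETED";                  // this late completion attempt is ignored
        });

        String seenByCaller;
        try {
            seenByCaller = f.orTimeout(50, TimeUnit.MILLISECONDS).join();
        } catch (CompletionException e) {
            seenByCaller = e.getCause().getClass().getSimpleName();
        }

        try {
            workDone.await(5, TimeUnit.SECONDS); // the computation was not stopped by the timeout
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seenByCaller + " / " + store.get();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```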

You can implement the desired operation like

CompletableFuture<Upload> future = expensiveService.processAndGet();
CompletableFuture<Upload> alternative = CompletableFuture.supplyAsync(
    () -> resultService.get(processId), CompletableFuture.delayedExecutor(2, TimeUnit.SECONDS));
return future.applyToEither(alternative, Function.identity())
    .whenComplete((u,t) -> alternative.cancel(false));

With delayedExecutor we use the same facility as orTimeout and completeOnTimeout. The specified Supplier is not evaluated before the specified time has elapsed, and not at all if the cancellation in whenComplete happens first. The applyToEither stage provides whichever result is available first.
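
To try the applyToEither approach in isolation, it can be wrapped in a small helper. This is a sketch, not a definitive implementation: `waitOrFallback`, the stubbed `processAndGet(long)`, and the delays are assumptions made for the example, and plain strings stand in for the real result type.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

class WaitOrFallback {
    static <T> CompletableFuture<T> waitOrFallback(
            CompletableFuture<T> future, Supplier<T> fallback, long timeout, TimeUnit unit) {
        // The fallback is only evaluated once the delay has elapsed, if at all.
        CompletableFuture<T> alternative = CompletableFuture.supplyAsync(
                fallback, CompletableFuture.delayedExecutor(timeout, unit));
        return future.applyToEither(alternative, Function.identity())
                // Cancel only after the "either" stage is done, so the cancellation cannot leak into it.
                .whenComplete((result, error) -> alternative.cancel(false));
    }

    /** Hypothetical service stub that completes with COMPLETED after the given delay. */
    static CompletableFuture<String> processAndGet(long millis) {
        return CompletableFuture.supplyAsync(
                () -> "COMPLETED", CompletableFuture.delayedExecutor(millis, TimeUnit.MILLISECONDS));
    }

    /** The service is faster than the timeout: its result is used, the fallback is never called. */
    static String fastCase() {
        AtomicInteger fallbackCalls = new AtomicInteger();
        String result = waitOrFallback(processAndGet(50),
                () -> { fallbackCalls.incrementAndGet(); return "PROCESSING"; },
                500, TimeUnit.MILLISECONDS).join();
        return result + " (fallback calls: " + fallbackCalls.get() + ")";
    }

    /** The service is slower than the timeout: the caller gets the fallback, the original still completes. */
    static String slowCase() {
        CompletableFuture<String> original = processAndGet(500);
        String result = waitOrFallback(original, () -> "PROCESSING", 50, TimeUnit.MILLISECONDS).join();
        return result + " then " + original.join();
    }

    public static void main(String[] args) {
        System.out.println(fastCase());
        System.out.println(slowCase());
    }
}
```

If the original future fails before the timeout, the returned stage fails with the same exception, which corresponds to the FAILED case in the question.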

This doesn’t complete the future on timeout, but as said, its completion wouldn’t affect the original computation anyway, so this would also work:

CompletableFuture<Upload> future = expensiveService.processAndGet();
CompletableFuture.delayedExecutor(2, TimeUnit.SECONDS)
    .execute(() -> {
        if(!future.isDone()) future.complete(resultService.get(processId));
    });
return future;

This completes the future after the timeout, again without affecting ongoing computations, and provides the alternative result to the caller. However, it does not propagate exceptions thrown by resultService.get(processId) to the returned future.
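
One possible way to close that gap is to catch whatever the fallback throws and forward it with completeExceptionally. The sketch below assumes a hypothetical helper named `completeOnTimeout` that takes a Supplier. Like the snippet above, it completes the future handed out by the service, so it only fits when that is acceptable.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class CompleteWithSupplierOnTimeout {
    static <T> CompletableFuture<T> completeOnTimeout(
            CompletableFuture<T> future, Supplier<T> fallback, long timeout, TimeUnit unit) {
        CompletableFuture.delayedExecutor(timeout, unit).execute(() -> {
            if (future.isDone()) {
                return;                          // finished in time, nothing to do
            }
            try {
                future.complete(fallback.get());
            } catch (Throwable t) {
                future.completeExceptionally(t); // forward the fallback's failure to the caller
            }
        });
        return future;
    }

    /** A fallback that fails: the failure shows up on the returned future. */
    static String demo() {
        CompletableFuture<String> neverCompleted = new CompletableFuture<>();
        Supplier<String> failingFallback = () -> {
            throw new IllegalStateException("result service unavailable");
        };
        try {
            return completeOnTimeout(neverCompleted, failingFallback, 50, TimeUnit.MILLISECONDS).join();
        } catch (CompletionException e) {
            return e.getCause().getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```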

Holger
  • Thanks. That first solution looks like it'll do what I want. I'd gone for `Object.wait/notify`, but that's much more elegant. – Druckles Jan 20 '20 at 18:25
  • However, if `processAndGet()` consists of several stages, or it takes a while for it to begin, it can still be completed in advance (e.g. in the second solution) and thereby not start or finish its work. That's what I'm trying to avoid. – Druckles Jan 20 '20 at 18:26
  • I see. Then, it might be a good move, if `processAndGet()` itself calls `copy()` on the last stage before returning, when it has required side effects. – Holger Jan 21 '20 at 07:45
  • But it's the calling function which determines whether it's required or not. The calling function is (in this case) a Web Controller that is willing to wait a little bit for a response but can't afford to wait longer. If it does take too long, it's desired that the computation continues (and the result fetched from elsewhere). For that reason, the first solution you posted would be the best solution. – Druckles Jan 21 '20 at 10:05
  • The first time it ran I received a `CancellationException` from the `alternative` future. I assume it was a race condition that led to this and I've moved the `whenComplete()` after the `applyToEither()` with a check for `!alternative.isDone()` which should solve the problem. Thank you again. – Druckles Jan 21 '20 at 10:07
  • The phrase “it's desired that the computation continues” implies that the background computation stores the results somewhere else, rather than only in the abandoned future, as otherwise “and the result fetched from elsewhere” would not be possible. So if there’s such a side effect, the method `processAndGet()` should care about this computation state. Even when it turns out to be unneeded, it doesn’t harm. It could also use `.minimalCompletionStage()` for its return value, to denote that the caller is not supposed to manipulate the completion state. – Holger Jan 21 '20 at 10:12
  • Re `CancellationException`, I see the mistake I made. I fixed it in the answer. There’s no need for an `isDone` check, as trying to complete an already completed future has no effect anyway. But it’s required to make the cancellation dependent on the *either* stage to be sure it has completed before canceling the *alternative* stage which then can not have any effect on the *either* stage. – Holger Jan 21 '20 at 10:16
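
The `copy()` suggestion from the comments can be illustrated with a short sketch (the names are hypothetical, and `copy()` requires Java 9 or later). The service keeps its internal future and hands out a copy, so a caller that completes its copy early, for example from a timeout fallback, cannot change the completion state the service itself relies on.

```java
import java.util.concurrent.CompletableFuture;

class CopyProtectsInternalFuture {
    static String demo() {
        CompletableFuture<String> internal = new CompletableFuture<>(); // stays inside the service
        CompletableFuture<String> handedOut = internal.copy();          // what the caller receives

        handedOut.complete("PROCESSING"); // caller gives up waiting and completes its copy
        internal.complete("COMPLETED");   // the service still finishes normally later

        return handedOut.join() + " / " + internal.join();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "PROCESSING / COMPLETED"
    }
}
```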