7

I have tricky situation, Does future.isDone() returns false, even if the thread is done.

import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class DataAccessor {
    private static ThreadPoolExecutor executor;
    private int timeout = 100000;
    static {
        executor = new ThreadPoolExecutor(10, 10, 1000, TimeUnit.SECONDS, new ArrayBlockingQueue<Runnable>(1000));
    }

    public static void main(String[] args) {
        List<String> requests = new ArrayList<String>();
        for(int i=0; i<20; i++){
            requests.add("request:"+i);
        }
        DataAccessor dataAccessor = new DataAccessor();

        List<ProcessedResponse> results = dataAccessor.getDataFromService(requests);
        for(ProcessedResponse response:results){
            System.out.println("response"+response.toString()+"\n");
        }
        executor.shutdown();
    }

    public List<ProcessedResponse> getDataFromService(List<String> requests) {
        final CountDownLatch latch = new CountDownLatch(requests.size());
        List<SubmittedJob> submittedJobs = new ArrayList<SubmittedJob>(requests.size());
        for (String request : requests) {
            Future<ProcessedResponse> future = executor.submit(new GetAndProcessResponse(request, latch));
            submittedJobs.add(new SubmittedJob(future, request));
        }
        try {
            if (!latch.await(timeout, TimeUnit.MILLISECONDS)) {
                // some of the jobs not done
                System.out.println("some jobs not done");
            }
        } catch (InterruptedException e1) {
            // take care, or cleanup
            for (SubmittedJob job : submittedJobs) {
                job.getFuture().cancel(true);
            }
        }
        List<ProcessedResponse> results = new LinkedList<DataAccessor.ProcessedResponse>();
        for (SubmittedJob job : submittedJobs) {
            try {
                // before doing a get you may check if it is done
                if (!job.getFuture().isDone()) {
                    // cancel job and continue with others
                    job.getFuture().cancel(true);
                    continue;
                }
                ProcessedResponse response = job.getFuture().get();
                results.add(response);
            } catch (ExecutionException cause) {
                // exceptions occurred during execution, in any
            } catch (InterruptedException e) {
                // take care
            }
        }
        return results;
    }

    private class SubmittedJob {
        final String request;
        final Future<ProcessedResponse> future;

        public Future<ProcessedResponse> getFuture() {
            return future;
        }

        public String getRequest() {
            return request;
        }

        SubmittedJob(final Future<ProcessedResponse> job, final String request) {
            this.future = job;
            this.request = request;
        }
    }

    private class ProcessedResponse {
        private final String request;
        private final String response;

        ProcessedResponse(final String request, final String response) {
            this.request = request;
            this.response = response;
        }

        public String getRequest() {
            return request;
        }

        public String getResponse() {
            return response;
        }

        public String toString(){
            return "[request:"+request+","+"response:"+ response+"]";
        }
    }

    private class GetAndProcessResponse implements Callable<ProcessedResponse> {
        private final String request;
        private final CountDownLatch countDownLatch;

        GetAndProcessResponse(final String request, final CountDownLatch countDownLatch) {
            this.request = request;
            this.countDownLatch = countDownLatch;
        }

        public ProcessedResponse call() {
            try {
                return getAndProcessResponse(this.request);
            } finally {
                countDownLatch.countDown();
            }
        }

        private ProcessedResponse getAndProcessResponse(final String request) {
            // do the service call
            // ........
            if("request:16".equals(request)){
                throw (new RuntimeException("runtime"));
            }
            return (new ProcessedResponse(request, "response.of." + request));
        }
    }
}

if I call future.isDone() it returns false though the coundownLatch.await() return true. Any Idea? Also to note that the countDownLatch.await comes out immediately when this happens.

If you are finding the format not readable view here, http://tinyurl.com/7j6cvep .

yadab
  • 2,063
  • 1
  • 16
  • 24
  • Your code doesn't make sense. `pooledExecutor.submit(LCallable)` doesn't work, where do you instantiate the `Callable`? Where do you create the latch, and where do you wait for it? – skaffman Mar 07 '12 at 15:55
  • 1
    My suspicion is that the callables may throw an uncaught exception which consequently aborts the worker threads, however also count down the latch before exiting from `call`. – Péter Török Mar 07 '12 at 16:21
  • @PéterTörök which does not come under ExecutionException ? Can you please explain? – yadab Mar 07 '12 at 16:30
  • You only get an ExecitionException if you call Future.get(). I would use `for(Future future: futures) future.get();` instead of using a count down latch or checking for isDone(). – Peter Lawrey Mar 07 '12 at 16:41
  • @yadab, I mean that you aren't catching any exceptions within `call`, neither setting any handlers for e.g. uncaught exceptions. So if e.g. a null pointer exception is thrown within `call`, it will be propagated upwards, eventually aborting the worker thread without you noticing anything. – Péter Török Mar 07 '12 at 16:44
  • Yadab can you show us a working test case? – John Vint Mar 07 '12 at 17:24
  • @PéterTörök Thanks for response, My doubt and JavaDoc says if there is an exception the future should get set to done? Or is it not set to done if there is a runtime exception? – yadab Mar 07 '12 at 18:20
  • @JohnVint Can you please explain "working test case"? – yadab Mar 07 '12 at 18:21
  • Can you write up a test case with a simple `main()` method that if we were to run we would see what you are seeing. – John Vint Mar 07 '12 at 18:55
  • 1
    @yadab, I have also checked the Javadoc since and it indeed suggests that my suspicion is wrong. I still recommend to deal with exceptions in some way though whenever you use threads. – Péter Török Mar 08 '12 at 08:44
  • @JohnVint provided a detailed example. – yadab Mar 09 '12 at 19:09
  • Does this bug happen every time or only sometimes? – Tudor Mar 09 '12 at 20:17

4 Answers4

9

The issue is most likely one of timing. the latch will be released before all of the tasks are actually complete with regards to the Future (because the countDown() invocation is within the call() method).

you are basically recreating the work of a CompletionService (implementation is ExecutorCompletionService), i would recommend using that instead. you can use the poll(timeout) method to get the results. just keep track of the total time and make sure you reduce your timeout on each call to the total remaining time.

jtahlborn
  • 52,909
  • 5
  • 76
  • 118
  • where should i call countDown() then? – yadab Mar 10 '12 at 04:02
  • 1
    The countDown is in the right position but you cannot rely on the CountdownLatch to be in sync with the Future. They have their own separate synchronization controls. That is the CountDownLatch may have all parties arrived but the FutureTask has not yet been signaled its complete. You need one or the other. – John Vint Mar 10 '12 at 10:00
  • @JohnVint so, instead of checking future.IsDone, I shall directly call future.get(dummyTimeout) ? Or Is there another good way to do this? Can you explain in a separate answer? – yadab Mar 10 '12 at 10:55
  • @yadab sure I will. But this answer does answer your question in case you were planning on accepting one. – John Vint Mar 10 '12 at 10:57
3

As jtahlborn mentioned this is probably a race condition in which the CountdownLatch signals its waiting threads, which the waiting threads evaluates the Future's cancel condition before the FutureTask finishes its execution (which will occur at some point after the countDown).

You simply cannot rely on the synchronization mechanisms of the CountdownLatch to be in sync with the sync mechanisms of a Future. What you should do is rely on the Future to tell you when it is done.

You can Future.get(long timeout, TimeUnit.MILLISECONDS) instead of CountdownLatch.await(long timeout, TimeUnit.MILLISECONDS). To get the same type of effect as the latch you can add all the Futures to a List, iterate over the list and get on each Future.

John Vint
  • 39,695
  • 7
  • 78
  • 108
2

Here is the scenario of the race condition:

  • The main thread is in latch.await, it receives no CPU slots from Java scheduler for milliseconds
  • The last executor thread calls countDownLatch.countDown() in the finally clause
  • The Java scheduler decides to give more priority to the main thread because it as waited for a while
  • As a result, when it asks for the last Future result, it is not available yet because the last executor thread gets no time slice to propagate the result, it is still in finally...

I have not found a detailed explanation about how Java scheduler really works, probably because it mainly depends on the operating system running the JVM but generally speaking it tries to equally give the CPU to runnable threads in average on a period of time. That is why the main thread can reach the isDone test before the other one left the finally clause.

I propose you change your results' collect after latch.await. As you know the latch has been decreased to zero (except if main thread was interrupted), all results should be available really soon. The get method with timeout let the scheduler the chance to assign a time slice to the last thread still waiting in the finally clause:

    for (SubmittedJob job : submittedJobs) {
        try {
            ProcessedResponse response = null;
            try {
                // Try to get answer in short timeout, should be available
                response = job.getFuture().get(10, TimeUnit.MILLISECONDS);
            } catch (TimeoutException te) {
                job.getFuture().cancel(true);
                continue;
            }
            results.add(response);
        } catch (ExecutionException cause) {
            // exceptions occurred during execution, in any
        } catch (InterruptedException e) {
            // take care
        }
    }

A remark: your code is not realistic as the getAndProcessResponse method ends in less than a milliseconds. With a random sleep there, the race condition does not come out so often.

Yves Martin
  • 10,217
  • 2
  • 38
  • 77
0

I second the opinions about race conditions. I'd suggest forget about the latch and use java.util.concurrent.ThreadPoolExecutor.awaitTermination(long, TimeUnit)

Vasiliy
  • 348
  • 3
  • 10