17

I have a library which is being used by customer and they are passing DataRequest object which has userid, timeout and some other fields in it. Now I use this DataRequest object to make a URL and then I make an HTTP call using RestTemplate and my service returns back a JSON response which I use it to make a DataResponse object and return this DataResponse object back to them.

Below is my DataClient class used by customer by passing DataRequest object to it. I am using timeout value passed by customer in DataRequest to timeout the request if it is taking too much time in getSyncData method.

public class DataClient implements Client {

    private RestTemplate restTemplate = new RestTemplate();
    // first executor
    private ExecutorService service = Executors.newFixedThreadPool(15);

    @Override
    public DataResponse getSyncData(DataRequest key) {
        DataResponse response = null;
        Future<DataResponse> responseFuture = null;

        try {
            responseFuture = getAsyncData(key);
            response = responseFuture.get(key.getTimeout(), key.getTimeoutUnit());
        } catch (TimeoutException ex) {
            response = new DataResponse(DataErrorEnum.CLIENT_TIMEOUT, DataStatusEnum.ERROR);
            responseFuture.cancel(true);
            // logging exception here               
        }

        return response;
    }   

    @Override
    public Future<DataResponse> getAsyncData(DataRequest key) {
        DataFetcherTask task = new DataFetcherTask(key, restTemplate);
        Future<DataResponse> future = service.submit(task);

        return future;
    }
}

DataFetcherTask class:

public class DataFetcherTask implements Callable<DataResponse> {

    private DataRequest key;
    private RestTemplate restTemplate;

    public DataFetcherTask(DataRequest key, RestTemplate restTemplate) {
        this.key = key;
        this.restTemplate = restTemplate;
    }

    @Override
    public DataResponse call() throws Exception {
        // In a nutshell below is what I am doing here. 
        // 1. Make an url using DataRequest key.
        // 2. And then execute the url RestTemplate.
        // 3. Make a DataResponse object and return it.

        // I am calling this whole logic in call method as LogicA
    }
}

As of now my DataFetcherTask class is responsible for one DataRequest key as shown above..

Problem Statement:-

Now I have a small design change. Customer will pass DataRequest (for example keyA) object to my library and then I will make a new http call to another service (which I am not doing in my current design) by using user id present in DataRequest (keyA) object which will give me back list of user id's so I will use those user id's and make few other DataRequest (keyB, keyC, keyD) objects one for each user id returned in the response. And then I will have List<DataRequest> object which will have keyB, keyC and keyD DataRequest object. Max element in the List<DataRequest> will be three, that's all.

Now for each of those DataRequest object in List<DataRequest> I want to execute above DataFetcherTask.call method in parallel and then make List<DataResponse> by adding each DataResponse for each key. So I will have three parallel calls to DataFetcherTask.call. Idea behind this parallel call is to get the data for all those max three keys in the same global timeout value.

So my proposal is - DataFetcherTask class will return back List<DataResponse> object instead of DataResponse and then signature of getSyncData and getAsyncData method will change as well. So here is the algorithm:

  • Use DataRequest object passed by customer to make List<DataRequest> by calling another HTTP service.
  • Make a parallel call for each DataRequest in List<DataRequest> to DataFetcherTask.call method and return List<DataResponse> object to customer instead of DataResponse.

With this way, I can apply same global timeout on step 1 along with step 2 as well. If either of above step is taking time, we will just timeout in getSyncData method.

DataFetcherTask class after design change:

public class DataFetcherTask implements Callable<List<DataResponse>> {

    private DataRequest key;
    private RestTemplate restTemplate;
    // second executor here
    private ExecutorService executorService = Executors.newFixedThreadPool(10);

    public DataFetcherTask(DataRequest key, RestTemplate restTemplate) {
        this.key = key;
        this.restTemplate = restTemplate;
    }

    @Override
    public List<DataResponse> call() throws Exception {
        List<DataRequest> keys = generateKeys();
        CompletionService<DataResponse> comp = new ExecutorCompletionService<>(executorService);

        int count = 0;
        for (final DataRequest key : keys) {
            comp.submit(new Callable<DataResponse>() {
                @Override
                public DataResponse call() throws Exception {
                    return performDataRequest(key);
                }
            });
        }

        List<DataResponse> responseList = new ArrayList<DataResponse>();
        while (count-- > 0) {
            Future<DataResponse> future = comp.take();
            responseList.add(future.get());
        }
        return responseList;
    }

    // In this method I am making a HTTP call to another service
    // and then I will make List<DataRequest> accordingly.
    private List<DataRequest> generateKeys() {
        List<DataRequest> keys = new ArrayList<>();
        // use key object which is passed in contructor to make HTTP call to another service
        // and then make List of DataRequest object and return keys.
        return keys;
    }       

    private DataResponse performDataRequest(DataRequest key) {
        // This will have all LogicA code here which is shown in my original design.
        // everything as it is same..
    }
}

Now my question is -

  • Does it have to be like this? What is the right design to solve this problem? I mean having call method in another call method looks weird?
  • Do we need to have two executors like I have in my code? Is there any better way to solve this problem or any kind of simplification/design change we can do here?

I have simplified the code so that idea gets clear what I am trying to do..

john
  • 11,311
  • 40
  • 131
  • 251
  • Calling a method on a collection might reasonably call a method on each element. It's a fairly common pattern. You could have one global ExecutorService for all clients although using a ForkJoinPool might be a better choice as you want the waiting thread to do some work while it's waiting. – Peter Lawrey Dec 16 '15 at 17:55
  • So how can I have global `ExecutorService` for all my clients in my current scenario? Also where do you see I have a waiting thread because of which I should explore `ForkJoinPool`? I mean why you were recommending `ForkJoinPool` here? – john Dec 16 '15 at 18:12
  • You make a field global by making it `static`. `Future.get()` is a blocking operation. Imagine you have many threads doing this in the same pool as the one doing actual work. You could get to the point where all the threads in the pool are blocked. – Peter Lawrey Dec 16 '15 at 18:14
  • 1
    Which version of Java are you using? If Java 8 is available, `CompletableFuture`s offer a powerful framework for chaining tasks together. If not, Guava's `ListenableFuture` provides similar functionality, but without taking advantage of Java 8's lambdas. – Andrew Rueckert Dec 19 '15 at 05:45
  • I am on Java 7 as of now. It will take some time for my company to start using Java 8. :( ... Ohh I see, I can use `ListenableFuture` looks like but I have never used it before so not sure how it can be used in my example. – john Dec 19 '15 at 06:13
  • @AndrewRueckert If possible can you provide an example with `Guava's ListenableFuture`. I am not sure how can I use it here? Any example will help me a lot here. – john Dec 24 '15 at 06:48
  • Sometimes it's a way much better to give a partial result instead nothing if one of couple requests fails (or didn't finish in time), as it can happen in `getSyncData(...)`. Do you need such option in you library? – dezhik Dec 25 '15 at 16:36
  • @dezhik Yes I think so but I guess in my code if one request fails at server side, my code will return partial response I mean success for other requests and failure for the one failed right? no? But at client side if it is taking more than global timeout, it will timeout everything.. This is what is happening right now in my code. – john Dec 25 '15 at 16:39
  • `if one request fails at server side, my code will return partial response I mean success for other requests and failure for the one failed right`. I don't see you current `performDataRequest(DataRequest key)` implementation, but in your post at CodeReview it was something like that.
    In my previous comment I talked about ability of returning partial result even after global timeout.
    – dezhik Dec 25 '15 at 17:38
  • Yes whatever I had in CR, it's same thing here as well. Just wanted to make it simpler so I have remove the code here. And looks like I learned about ForkJoin framework after Peter/Stefan suggested me. Can we return back partial response in case of global timeout? – john Dec 25 '15 at 18:29
  • Check this library https://github.com/Netflix/Hystrix – mommcilo Dec 26 '15 at 00:39

2 Answers2

4

As already mentioned in the comments of your question, you can use Java's ForkJoin framework. This will save you the extra thread pool within your DataFetcherTask.

You simply need to use a ForkJoinPool in your DataClient and convert your DataFetcherTask into a RecursiveTask (one of ForkJoinTask's subtypes). This allows you to easily execute other subtasks in parallel.

So, after these modifications your code will look something like this:

DataFetcherTask

The DataFetcherTask is now a RecursiveTask which first generates the keys and invokes subtasks for each generated key. These subtasks are executed in the same ForkJoinPool as the parent task.

public class DataFetcherTask extends RecursiveTask<List<DataResponse>> {

  private final DataRequest key;
  private final RestTemplate restTemplate;

  public DataFetcherTask(DataRequest key, RestTemplate restTemplate) {
      this.key = key;
      this.restTemplate = restTemplate;
  }

  @Override
  protected List<DataResponse> compute() {
    // Create subtasks for the key and invoke them
    List<DataRequestTask> requestTasks = requestTasks(generateKeys());
    invokeAll(requestTasks);

    // All tasks are finished if invokeAll() returns.
    List<DataResponse> responseList = new ArrayList<>(requestTasks.size());
    for (DataRequestTask task : requestTasks) {
      try {
        responseList.add(task.get());
      } catch (InterruptedException | ExecutionException e) {
        // TODO - Handle exception properly
        Thread.currentThread().interrupt();
        return Collections.emptyList();
      }
    }

    return responseList;
  }

  private List<DataRequestTask> requestTasks(List<DataRequest> keys) {
    List<DataRequestTask> tasks = new ArrayList<>(keys.size());
    for (DataRequest key : keys) {
      tasks.add(new DataRequestTask(key));
    }

    return tasks;
  }

  // In this method I am making a HTTP call to another service
  // and then I will make List<DataRequest> accordingly.
  private List<DataRequest> generateKeys() {
      List<DataRequest> keys = new ArrayList<>();
      // use key object which is passed in contructor to make HTTP call to another service
      // and then make List of DataRequest object and return keys.
      return keys;
  }

  /** Inner class for the subtasks. */
  private static class DataRequestTask extends RecursiveTask<DataResponse> {

    private final DataRequest request;

    public DataRequestTask(DataRequest request) {
      this.request = request;
    }

    @Override
    protected DataResponse compute() {
      return performDataRequest(this.request);
    }

    private DataResponse performDataRequest(DataRequest key) {
      // This will have all LogicA code here which is shown in my original design.
      // everything as it is same..
      return new DataResponse(DataErrorEnum.OK, DataStatusEnum.OK);
    }
  }

}

DataClient

The DataClient will not change much except for the new thread pool:

public class DataClient implements Client {

  private final RestTemplate restTemplate = new RestTemplate();
  // Replace the ExecutorService with a ForkJoinPool
  private final ForkJoinPool service = new ForkJoinPool(15);

  @Override
  public List<DataResponse> getSyncData(DataRequest key) {
      List<DataResponse> responsList = null;
      Future<List<DataResponse>> responseFuture = null;

      try {
          responseFuture = getAsyncData(key);
          responsList = responseFuture.get(key.getTimeout(), key.getTimeoutUnit());
      } catch (TimeoutException | ExecutionException | InterruptedException ex) {
          responsList = Collections.singletonList(new DataResponse(DataErrorEnum.CLIENT_TIMEOUT, DataStatusEnum.ERROR));
          responseFuture.cancel(true);
          // logging exception here
      }

      return responsList;
  }

  @Override
  public Future<List<DataResponse>> getAsyncData(DataRequest key) {
      DataFetcherTask task = new DataFetcherTask(key, this.restTemplate);
      return this.service.submit(task);
  }
}

Once you are on Java8 you may consider changing the implementation to CompletableFutures. Then it would look something like this:

DataClientCF

public class DataClientCF {

  private final RestTemplate restTemplate = new RestTemplate();
  private final ExecutorService executor = Executors.newFixedThreadPool(15);

  public List<DataResponse> getData(DataRequest initialKey) {
    return CompletableFuture.supplyAsync(() -> generateKeys(initialKey), this.executor)
      .thenApply(requests -> requests.stream().map(this::supplyRequestAsync).collect(Collectors.toList()))
      .thenApply(responseFutures -> responseFutures.stream().map(future -> future.join()).collect(Collectors.toList()))
      .exceptionally(t -> { throw new RuntimeException(t); })
      .join();
  }

  private List<DataRequest> generateKeys(DataRequest key) {
    return new ArrayList<>();
  }

  private CompletableFuture<DataResponse> supplyRequestAsync(DataRequest key) {
    return CompletableFuture.supplyAsync(() -> new DataResponse(DataErrorEnum.OK, DataStatusEnum.OK), this.executor);
  }
}

As mentioned in the comments, Guava's ListenableFutures would provide similar functionality for Java7 but without Lambdas they tend to get clumsy.

Stefan Ferstl
  • 5,135
  • 3
  • 33
  • 41
1

As I know, RestTemplate is blocking, it is said in ForkJoinPool JavaDoc in ForkJoinTask:

Computations should avoid synchronized methods or blocks, and should minimize other blocking synchronization apart from joining other tasks or using synchronizers such as Phasers that are advertised to cooperate with fork/join scheduling. ...
Tasks should also not perform blocking IO,...

Call in call is redundant.
And you don't need two executors. Also you can return partial result in getSyncData(DataRequest key). This can be done like this

DataClient.java

public class DataClient implements Client {

    private RestTemplate restTemplate = new RestTemplate();
    // first executor
    private ExecutorService service = Executors.newFixedThreadPool(15);

    @Override
    public List<DataResponse> getSyncData(DataRequest key) {
        List<DataResponse> responseList = null;
        DataFetcherResult response = null;
        try {
            response = getAsyncData(key);
            responseList = response.get(key.getTimeout(), key.getTimeoutUnit());
        } catch (TimeoutException ex) {
            response.cancel(true);
            responseList = response.getPartialResult();
        }
        return responseList;
    }

    @Override
    public DataFetcherResult getAsyncData(DataRequest key) {
        List<DataRequest> keys = generateKeys(key);
        final List<Future<DataResponse>> responseList = new ArrayList<>();
        final CountDownLatch latch = new CountDownLatch(keys.size());//assume keys is not null
        for (final DataRequest _key : keys) {
            responseList.add(service.submit(new Callable<DataResponse>() {
                @Override
                public DataResponse call() throws Exception {
                    DataResponse response = null;
                    try {
                        response = performDataRequest(_key);
                    } finally {
                        latch.countDown();
                        return response;
                    }
                }
            }));
        }
        return new DataFetcherResult(responseList, latch);
    }

    // In this method I am making a HTTP call to another service
    // and then I will make List<DataRequest> accordingly.
    private List<DataRequest> generateKeys(DataRequest key) {
        List<DataRequest> keys = new ArrayList<>();
        // use key object which is passed in contructor to make HTTP call to another service
        // and then make List of DataRequest object and return keys.
        return keys;
    }

    private DataResponse performDataRequest(DataRequest key) {
        // This will have all LogicA code here which is shown in my original design.
        // everything as it is same..
        return null;
    }
}

DataFetcherResult.java

public class DataFetcherResult implements Future<List<DataResponse>> {
    final List<Future<DataResponse>> futures;
    final CountDownLatch latch;

    public DataFetcherResult(List<Future<DataResponse>> futures, CountDownLatch latch) {
        this.futures = futures;
        this.latch = latch;
    }

    //non-blocking
    public List<DataResponse> getPartialResult() {
        List<DataResponse> result = new ArrayList<>(futures.size());
        for (Future<DataResponse> future : futures) {
            try {
                result.add(future.isDone() ? future.get() : null);
                //instead of null you can return new DataResponse(DataErrorEnum.NOT_READY, DataStatusEnum.ERROR);
            } catch (InterruptedException | ExecutionException e) {
                e.printStackTrace();
                //ExecutionException or CancellationException could be thrown, especially if DataFetcherResult was cancelled
                //you can handle them here and return DataResponse with corresponding DataErrorEnum and DataStatusEnum
            }
        }
        return result;
    }

    @Override
    public List<DataResponse> get() throws ExecutionException, InterruptedException {
        List<DataResponse> result = new ArrayList<>(futures.size());
        for (Future<DataResponse> future : futures) {
            result.add(future.get());
        }
        return result;
    }

    @Override
    public List<DataResponse> get(long timeout, TimeUnit timeUnit)
            throws ExecutionException, InterruptedException, TimeoutException {
        if (latch.await(timeout, timeUnit)) {
            return get();
        }
        throw new TimeoutException();//or getPartialResult()
    }

    @Override
    public boolean cancel(boolean mayInterruptIfRunning) {
        boolean cancelled = true;
        for (Future<DataResponse> future : futures) {
            cancelled &= future.cancel(mayInterruptIfRunning);
        }
        return cancelled;
    }

    @Override
    public boolean isCancelled() {
        boolean cancelled = true;
        for (Future<DataResponse> future : futures) {
            cancelled &= future.isCancelled();
        }
        return cancelled;
    }

    @Override
    public boolean isDone() {
        boolean done = true;
        for (Future<DataResponse> future : futures) {
            done &= future.isDone();
        }
        return done;
    }

    //and etc.
}

I wrote it with CountDownLatch and it looks great, but note there is a nuance. You can get stuck for a little while in DataFetcherResult.get(long timeout, TimeUnit timeUnit) because CountDownLatch is not synchronized with future's state. And it could happen that latch.getCount() == 0 but not all futures would return future.isDone() == true at the same time. Because they have already passed latch.countDown(); inside finally {} Callable's block but didn't change internal state which is still equals to NEW.
And so calling get() inside get(long timeout, TimeUnit timeUnit) can cause a small delay.
Similar case was described here.

Get with timeout DataFetcherResult.get(...) could be rewritten using futures future.get(long timeout, TimeUnit timeUnit) and you can remove CountDownLatch from a class.

public List<DataResponse> get(long timeout, TimeUnit timeUnit)
        throws ExecutionException, InterruptedException{
    List<DataResponse> result = new ArrayList<>(futures.size());
    long timeoutMs = timeUnit.toMillis(timeout);
    boolean timeout = false;
    for (Future<DataResponse> future : futures) {
        long beforeGet = System.currentTimeMillis();
        try {
            if (!timeout && timeoutMs > 0) {
                result.add(future.get(timeoutMs, TimeUnit.MILLISECONDS));
                timeoutMs -= System.currentTimeMillis() - beforeGet;
            } else {
                if (future.isDone()) {
                    result.add(future.get());
                } else {
                    //result.add(new DataResponse(DataErrorEnum.NOT_READY, DataStatusEnum.ERROR)); ?
                }
            }
        } catch (TimeoutException e) {
            result.add(new DataResponse(DataErrorEnum.TIMEOUT, DataStatusEnum.ERROR));
            timeout = true;
        }
        //you can also handle ExecutionException or CancellationException here
    }

    return result;
}

This code was given as an example and it should be tested before using in production, but seems legit :)

Community
  • 1
  • 1
dezhik
  • 990
  • 10
  • 16