I have a method that takes a string as input and returns data for that string. I also have a list of strings, and at the moment I loop over the whole list and pass each string to the method in turn.

import java.util.ArrayList;
import java.util.List;

public DataClass getData(String input) {
    // ....logic to get the data when string=input from a third party API.
    // Third party API takes 'input' string and gives out data....
}

public void callerMethod() {
    List<String> myStrings = new ArrayList<String>();
    // Calls getData() sequentially, one string at a time
    for (String inputStr : myStrings) {
        DataClass data = getData(inputStr);
    }
}

The code above is the logic I have right now. I want the getData() calls to run concurrently instead of one after another, because the sequential loop is too slow. I am not sure whether I can use threads here or whether there is a newer approach for this.

user811433
  • If you're reading from a DB I can guarantee you're IO-bound. Parallelising IO over a single channel won't help. You should consider moving the filtering logic into the DB query. – millimoose Sep 25 '13 at 22:56
  • Though I have other areas like this with DB IO, in the current case, I am making an API call to a third party API. Editing the question to include this info. – user811433 Sep 25 '13 at 23:01
  • @millimoose Can you? There are plenty of cases where you can do a CPU bound action based on a string, or do other IO bound tasks. – Benjamin Gruenbaum Sep 25 '13 at 23:02
  • @user811433 What about a concurrent queue and workers using the producer consumer pattern? – Benjamin Gruenbaum Sep 25 '13 at 23:02
  • @BenjaminGruenbaum Okay, so "guarantee" was too strong. I'd still bet pretty good money on it though. – millimoose Sep 25 '13 at 23:02
  • Anyway, my first approach would be [`ExecutorService.submit()`](http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorService.html#submit(java.util.concurrent.Callable)) (or `invokeAll()`) and fetching the data from the returned `Future`s. That's assuming the third-party API can be used in a thread-safe way. – millimoose Sep 25 '13 at 23:04
  • @Benjamin Gruenbaum - I am unsure as to which part can use the concurrent queue. I was looking for a way where I can replace the for loop in CallerMethod() with some concurrency logic so that all needed calls to getData() go simultaneously. – user811433 Sep 25 '13 at 23:05
  • @millimoose - yes the third party API is thread safe. I did not use ExecutorService before. I will try to find some examples for that. – user811433 Sep 25 '13 at 23:06
  • @user811433 If only you were using C#, this would just have been `Parallel.ForEach`, which would take care of it for you. Java 8 provides a similar interface with the new streams API. Unfortunately for you, in Java 7 you need 2 classes that implement Runnable: one for a worker that consumes from a ConcurrentLinkedQueue (have like 8 of these, or whatever number of cores you have) and one for the producer that reads from the DB and feeds the queue. millimoose's suggestion with Futures is probably a better approach if you want anything more. – Benjamin Gruenbaum Sep 25 '13 at 23:07
  • @user811433 The example is like right there in that javadoc file. – millimoose Sep 25 '13 at 23:09
  • @Benjamin Gruenbaum - my List will go into the ConcurrentLinkedQueue? and getData() should turn to a producer and callerMethod() into a consumer? – user811433 Sep 25 '13 at 23:10
  • @millimoose - I am looking at http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorService.html – user811433 Sep 25 '13 at 23:11
  • @user811433 "Usage examples", top of the file? – millimoose Sep 25 '13 at 23:30
  • @millimoose - yes, I went through those. Thank you. – user811433 Sep 25 '13 at 23:33

1 Answer

This can be parallelized using the Executor framework. Create a ThreadPoolExecutor. The number of threads should probably be equal to the number of concurrent connections you can have to the database (i.e. connection pool size).

Loop through your strings. For each string, create a Callable that wraps getData and submit the callable to the executor. The executor will return a Future which you can use later. Once you have submitted all of the callables, you can start retrieving the DataClasses from your Futures.
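
The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the asker's actual code: `DataClass` and `getData` here are hypothetical stand-ins for the real third-party call, `fetchAll` is an illustrative name, and the pool size of 4 is arbitrary and should be tuned to whatever the API tolerates. It uses Java 7 style anonymous classes, since that is the version discussed in the comments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentFetch {

    // Hypothetical stand-in for the question's DataClass.
    static class DataClass {
        final String value;
        DataClass(String value) { this.value = value; }
    }

    // Hypothetical stand-in for the real third-party call; it must be thread-safe.
    static DataClass getData(String input) {
        return new DataClass("data-for-" + input);
    }

    static List<DataClass> fetchAll(List<String> inputs, int poolSize)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(poolSize);
        try {
            // Submit one Callable per input; submit() returns immediately.
            List<Future<DataClass>> futures = new ArrayList<Future<DataClass>>();
            for (final String input : inputs) {
                futures.add(executor.submit(new Callable<DataClass>() {
                    @Override
                    public DataClass call() {
                        return getData(input);
                    }
                }));
            }
            // Collect results in submission order; get() blocks until each one is done.
            List<DataClass> results = new ArrayList<DataClass>();
            for (Future<DataClass> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            executor.shutdown(); // let the worker threads exit once the tasks finish
        }
    }

    public static void main(String[] args) throws Exception {
        for (DataClass d : fetchAll(Arrays.asList("a", "b", "c"), 4)) {
            System.out.println(d.value);
        }
    }
}
```

Submitting everything first and only then calling `get()` is what lets the calls overlap; calling `get()` right after each `submit()` would make the loop sequential again. If `getData` throws, `get()` rethrows the failure wrapped in an `ExecutionException`.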

Carlos Macasaet
  • One question: future.get() does not return any data for me, though. Is it supposed to return values? – user811433 Sep 26 '13 at 15:19
  • @user811433, future.get() will return the same thing as callable.call(). The difference is that if you invoke call() yourself, it will execute immediately. However, if you submit the callable to an executor, then it may be invoked sometime in the future. The future allows you to get the value of the deferred call. If the value is not available yet, it will block until it is. – Carlos Macasaet Oct 03 '13 at 06:14
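
One possible reason for `get()` appearing to return nothing is that the task was submitted as a `Runnable` rather than a `Callable`. This is only a guess, since the asker's code is not shown. The sketch below contrasts the two; the class and method names are hypothetical and chosen for illustration only.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureGetDemo {

    // A Runnable has no return value, so the Future's get() yields null on completion.
    static Object viaRunnable(ExecutorService executor) throws Exception {
        Future<?> future = executor.submit(new Runnable() {
            @Override
            public void run() {
                // work happens here, but nothing is returned
            }
        });
        return future.get(); // blocks until run() finishes, then returns null
    }

    // A Callable returns a value, and get() hands back exactly that value.
    static String viaCallable(ExecutorService executor) throws Exception {
        Future<String> future = executor.submit(new Callable<String>() {
            @Override
            public String call() {
                return "result";
            }
        });
        return future.get(); // blocks until call() finishes, then returns its value
    }

    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            System.out.println(viaRunnable(executor));
            System.out.println(viaCallable(executor));
        } finally {
            executor.shutdown();
        }
    }
}
```

If the task wrapping getData() is written as a `Callable<DataClass>` whose `call()` returns the result of getData(), then `get()` should return that same object.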