Assume that I have a set of objects that need to be analyzed in two different ways, both of which take relatively long time and involve IO-calls, I am trying to figure out how/if I could go about optimizing this part of my software, especially utilizing the multiple processors (the machine i am sitting on for ex is a 8-core i7 which almost never goes above 10% load during execution).
I am quite new to parallel-programming or multi-threading (not sure what the right term is), so I have read some of the prior questions, particularly paying attention to highly voted and informative answers. I am also in the process of going through the Oracle/Sun tutorial on concurrency.
Here's what I thought out so far;
- A thread-safe collection holds the objects to be analyzed
- As soon as there are objects in the collection (they come a couple at a time from a series of queries), a thread per object is started
- Each specific thread takes care of the initial pre-analysis preparations; and then calls on the analyses.
- The two analyses are implemented as Runnables/Callables, and thus called on by the thread when necessary.
And my questions are:
- Is this a reasonable scheme, if not, how would you go about doing this?
- In order to make sure things don't get out of hand, should I implement a ThreadManager or some thing of that sort, which starts and stops threads, and re-distributes them when they are complete? For example, if i have 256 objects to be analyzed, and 16 threads in total, the ThreadManager assigns the first finished thread to the 17th object to be analyzed etc.
- Is there a dramatic difference between Runnable/Callable other than the fact that Callable can return a result? Otherwise should I try to implement my own interface, in that case why?
Thanks,