I have a RESTful-styled RPC (remote procedure call) API running on a tomcat server that processes data of N users with M tasks on K threads. Mostly one user has around 20 to 500 tasks (but M could be between 1 to 5000). One task needs around 10 to 20 seconds to complete, but can be between 1 second and 20 minutes. Currently, mostly the system has one user, sometimes up to three, but it increases to around 10 users at the same time in the near future. Our server has 10 cores, therefore I'd like to use 10 threads. At the moment every user gets 5 threads for processing, which works fine. But a) most of the time the machine is only utilized 50% (which results in needles waiting in the "30-minute" range), sometimes the server load is up to 150%.
Requirements to solution:
- at all times the server is utilized to 100% (if there are tasks)
- that all users are treated the same regarding thread execution (same amount of threads finished as every other user)
- a new user does not have to wait until all tasks of a earlier user are done (especially in the case where user1 has 5000 tasks and user2 has 1 this is important)
Solutions that come to mind:
just use a FixedThreadPoolExecutor with 10 threads, violates condition 3
use the PriorityBlockingQueue and implement the compareTo method in my task -> can not use the threadpoolExecutors submit method (and therefore I do not know when a submitted task is over)
implement a "round robin" like blocking queue, where the K threads (in our case 10) take new tasks from the N internal queues in a round robin way -> to be able to put a task into the right queue, I need a "submit"-method that takes more than one parameter (I need to implement a ThreadPoolExecutor, too)
I tried to make an illustration of what I mean by round robin like blocking queue (if not helpful feel free to edit it out):
-- -- -- -- -- -- queue task load, -- -- -- -- -- -- -- one task denoted by -- -- -- -- -- -- -- -- -- | Q1 | Q2 | Q3 | Q4 | Q5 | Q6 | Q7 | QN | | * ^ | | last| |next | | ------------- \ / \ | | | | | | T1 | T2 | T3 | T4 | TK |
Is there an elegant solution to use mostly Java standard APIs (or any other widespread Java API) for achieving this kind of processing behavior (might it be one of my proposed solutions or any another solution)? Or do you have any other hints on how to tackle this issue?