
First, assume a scenario like this:

  { Queue1: [taskA1, taskA2, taskA3, ... ] }
  { Queue2: [taskB1, taskB2, ... ] }
  ...
  { QueueN: [taskN1, ... ] }

These queues contain varying numbers of tasks, and tasks are pushed dynamically.

Secondly, we have a list of users:

  [ user1, user2, user3, ... ]

Different users have different permissions for queues, for example:

  { user1: { Queue1, Queue2, ... } }
  { user2: { Queue1 } }
  { ... }
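
In Java-ish terms, the data model is roughly the following sketch; the class and field names are made up for illustration and this is not our actual GemFire layout:

  import java.util.LinkedHashMap;
  import java.util.Map;
  import java.util.Queue;
  import java.util.Set;

  class TaskModel {
      // Queues kept in priority order: Queue1 first, Queue2 next, and so on.
      Map<String, Queue<String>> queuesByPriority = new LinkedHashMap<>();

      // For each user, the names of the queues that user is allowed to pull from.
      Map<String, Set<String>> permissionsByUser = new LinkedHashMap<>();
  }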

Thirdly, there are some conditions:

  • Queues have priority: Queue1's tasks must be done before Queue2's tasks.
  • Some queues may be empty because they currently have no tasks
  • Using a single priority queue is not an option, because every user wants as many tasks as he can get
  • System memory and performance can be considered ample, and we use GemFire
  • So multi-threading is not an option
  • There are thousands of users and hundreds of queues

This is a real problem we ran into. When our system first went online, fetching a task took no more than 50ms. But as the numbers of users and queues grew, the response time increased to hundreds of milliseconds. We tried to optimize with a BitSet, but that only improved performance by about 50%.
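
The BitSet idea was roughly this: give each queue a fixed index in priority order, keep one BitSet of currently non-empty queues and one BitSet of permitted queues per user, then AND them and take the lowest set bit. The sketch below only illustrates that idea; the names (TaskDispatcher, fetchTaskFor, nonEmptyQueues) are invented and it is not our production code:

  import java.util.BitSet;
  import java.util.List;
  import java.util.Map;
  import java.util.Queue;

  class TaskDispatcher {
      // Queue index 0 has the highest priority, index 1 the next, and so on.
      private final List<Queue<String>> queuesByPriority;
      // Bit i is set while queue i currently has at least one task.
      private final BitSet nonEmptyQueues;
      // For each user, bit i is set if the user may pull from queue i.
      private final Map<String, BitSet> userPermissions;

      TaskDispatcher(List<Queue<String>> queuesByPriority,
                     BitSet nonEmptyQueues,
                     Map<String, BitSet> userPermissions) {
          this.queuesByPriority = queuesByPriority;
          this.nonEmptyQueues = nonEmptyQueues;
          this.userPermissions = userPermissions;
      }

      // Returns the next task for the user, or null if nothing is available.
      String fetchTaskFor(String user) {
          BitSet candidates = (BitSet) userPermissions.get(user).clone();
          candidates.and(nonEmptyQueues);      // permitted AND non-empty
          int i = candidates.nextSetBit(0);    // lowest index = highest priority
          if (i < 0) {
              return null;
          }
          Queue<String> queue = queuesByPriority.get(i);
          String task = queue.poll();
          if (queue.isEmpty()) {
              nonEmptyQueues.clear(i);         // keep the non-empty set up to date
          }
          return task;
      }
  }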

So maybe Stack Overflow has the ultimate answer.

Nappp
  • Are you looking for a generic solution or a Gemfire-specific solution? If the latter, then you should add the gemfire tag and maybe remove 'algorithm', as the answer will probably be very technical rather than generic. – Artur Biesiadowski Dec 15 '16 at 14:06
  • I don't understand the question. – Gribouillis Dec 15 '16 at 14:24
  • Add some code and implementation details. Thousands of users and hundreds of queues should work with good response times even with the most naïve implementation. So there must be a bottleneck somewhere. – Serg M Ten Dec 15 '16 at 14:34
  • @ArturBiesiadowski A general answer would be fine. Gemfire was mentioned because I wanted to let you know that a high-performance cache is available. – Nappp Dec 16 '16 at 01:07
  • @Gribouillis Which point confused you? – Nappp Dec 16 '16 at 01:08
  • You didn't ask a question . . . – Jim Mischel Dec 16 '16 at 05:08
  • 1) How long does executing a task take? Is it very variable from task to task, or are they similar on average? 2) Is the task data fully self-contained, or does it depend on some external resources which have to be fetched/accessed? 3) What is the expected latency between machines (sub-ms, tens of ms, hundreds of ms)? 4) Is the requirement of completing tasks from q1 before q2 a hard requirement (the system will fail if q2 finishes earlier, so a global sequence is required), or should it just be best effort (q1 should generally have higher priority than q2)? – Artur Biesiadowski Dec 16 '16 at 09:21
  • @SergioMontoro The code can't be accessed outside our company's machines. But the general idea is: 1) cache a user's queues when the user logs in, 2) iterate over those queues (see the sketch after these comments). I know it looks silly, but it is the best we could do given how users compete for tasks. – Nappp Dec 16 '16 at 14:33
  • @ArturBiesiadowski 1) Tasks vary a lot; they are very different kinds of tasks, and maybe I should split them up at some point. 2) Each task has a Task ID; once you have the ID, you have the task. 3) The environment is entirely on an intranet and network latency is negligible; we also use IBM machines, so we are hoping for 50ms or so. 4) Task priority is more like a dependency: you complete component A from Queue A and component B from Queue B, then Queue C's task can be done, but currently we push the C task only after A and B are done, so the system won't fail. It would be just best effort. – Nappp Dec 16 '16 at 14:44
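
To make the cache-and-iterate approach described in the comments concrete, here is a minimal sketch of that idea; the class and method names are invented for illustration:

  import java.util.List;
  import java.util.Map;
  import java.util.Queue;

  class NaiveFetcher {
      // Cached at login: the user's permitted queues, already sorted by priority.
      private final Map<String, List<Queue<String>>> cachedQueuesByUser;

      NaiveFetcher(Map<String, List<Queue<String>>> cachedQueuesByUser) {
          this.cachedQueuesByUser = cachedQueuesByUser;
      }

      // Walk the user's queues in priority order and take the first available task.
      String fetchTaskFor(String user) {
          for (Queue<String> queue : cachedQueuesByUser.get(user)) {
              String task = queue.poll();  // null when this queue is currently empty
              if (task != null) {
                  return task;
              }
          }
          return null;                     // every permitted queue is empty
      }
  }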
