
I have jobs named A, B, C, D. Job B has to start after job A has finished, so the order of jobs should look like this: A->B->C->D.

I want to scale the number of workers for A, B, C and D independently. Is there a way to implement this using RabbitMQ? I am basically looking for a way to create a series of jobs.

My current design looks like this:

  1. The caller process creates seriesOfJobs: an array of JSON objects that describe jobs A, B, C, D using the JSON-RPC protocol
  2. The caller sends the seriesOfJobs to a seriesManager (separate process) via RabbitMQ RPC and awaits a callback on mainCallbackQueue
  3. The seriesManager parses seriesOfJobs, sends job A to workerA (separate process) via RabbitMQ RPC and awaits a callback on callbackQueueA
  4. workerA performs job A and notifies seriesManager via callbackQueueA
  5. seriesManager gets the callback from callbackQueueA, sends job B to workerB and awaits a callback, then the same for job C, then the same for job D
  6. seriesManager knows that jobs A, B, C, D have finished - it notifies the caller via mainCallbackQueue
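To make step 1 concrete, here is a minimal sketch of how the caller might build the seriesOfJobs payload as JSON-RPC requests. The helper name, parameter shape, and use of a UUID as the JSON-RPC `id` (so callbacks can be correlated) are assumptions for illustration, not part of the design described above.

```python
import json
import uuid


def make_series_of_jobs(job_names, params_by_job=None):
    """Build seriesOfJobs: one JSON-RPC 2.0 request object per job.

    Hypothetical helper -- names and structure are illustrative only.
    """
    params_by_job = params_by_job or {}
    return [
        {
            "jsonrpc": "2.0",
            "method": name,               # e.g. "A", "B", "C", "D"
            "params": params_by_job.get(name, {}),
            "id": str(uuid.uuid4()),      # lets the callback be correlated
        }
        for name in job_names
    ]


series = make_series_of_jobs(["A", "B", "C", "D"])
payload = json.dumps(series)  # this string is what goes over RabbitMQ
```

The seriesManager would then publish `series[0]` to workerA's queue and wait on callbackQueueA before moving on to `series[1]`.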

I am using the concept of RPC as described in the RabbitMQ RPC tutorial. Is there a simpler way to do this?


1 Answer


(Unfortunately I don't have enough reputation to comment, so this may be a somewhat lacking answer as I can't clarify requirements, though I'll try and edit it to stick to what's needed)

Is there an absolute need for the seriesManager to be present?

It may be more logical to have workerA create job B for workerB and so on and so forth rather than consistently calling back to a central hub.

In which case your current design would change to:

  1. caller creates seriesOfJobs.
  2. caller sends seriesOfJobs to workerA.
  3. workerA performs job A and sends remaining seriesOfJobs to workerB.
  4. workerB performs job B and sends remaining seriesOfJobs to workerC.
  5. workerC performs job C and sends remaining seriesOfJobs to workerD.
  6. workerD performs job D and notifies caller via mainCallbackQueue.
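The steps above can be sketched as a single function each worker runs: handle the head of the remaining seriesOfJobs, then decide where the rest should go. The queue-naming convention (`"queue_" + method`) and the function names here are assumptions for illustration; only the forwarding pattern itself is what the answer proposes.

```python
def handle_series(series_of_jobs, do_work):
    """One worker's step in the chain.

    Runs the first job in the list and returns (result, remaining_jobs,
    next_destination), where next_destination is the next worker's queue
    name, or "mainCallbackQueue" once the series is exhausted.
    """
    head, *remaining = series_of_jobs
    result = do_work(head)
    if remaining:
        next_dest = "queue_" + remaining[0]["method"]   # hand off to next worker
    else:
        next_dest = "mainCallbackQueue"                 # done: notify the caller
    return result, remaining, next_dest


# Example: workerA handles job A and forwards B, C, D to workerB's queue.
jobs = [{"method": m, "params": {}} for m in ("A", "B", "C", "D")]
result, rest, dest = handle_series(jobs, lambda job: job["method"].lower())
```

Each worker would publish `rest` to `dest` with its RabbitMQ client; no central process needs to hold state between steps.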

I would regard that as a "simpler way", seeing as it takes that nasty central hub out of the equation.

  • The design you suggested is similar to the current one (in the project I am working on). The central hub approach has two advantages for me: 1) all the `smarts` are in one place, `workers` are just dumb `rpc` request/response endpoints 2) If job `B` fails the central hub can abort jobs `C/D` and notify the `caller` – Jan Grz Jun 15 '16 at 14:58
  • I can export all the `smarts` to a library that `workers` will use...I am not experienced enough to tell which approach is better :) – Jan Grz Jun 15 '16 at 15:03
  • Also the central hub enables not only `series` but also `parallel` - but I admit it introduces a lot of complexity (additional queues for callbacks and so on..) – Jan Grz Jun 15 '16 at 15:11
  • Workers being dumb is usually a good choice as, more than anything, it makes them nice and reusable, though I'd see the extra complication in their functions to be a good trade off for removing the `seriesManager`. Personally, however, I can't see a simpler way to achieve what you want without the removal of that, which as you say can provide some benefits. – Jack Williams Jun 15 '16 at 15:44
  • I realised that with `seriesOfJobs` and callback, a smart `caller` can implement also `parallel` and other async patterns. I am getting more and more convinced of this `smart-worker` approach – Jan Grz Jun 16 '16 at 07:43