
I have been researching how to efficiently solve the following use case and I am struggling to find the best solution.

Basically, I have a Node.js REST API which handles requests for users from a mobile application. We want some requests to launch background tasks outside of the req/res flow because they are CPU-intensive or might just take a while to execute. We are trying to implement, or adopt an existing framework for, a set of job queues that work in the following way (or are at least compatible with the use case):

  • Every user has their own set of job queues (there are different kinds of jobs).
  • The jobs within one specific queue have to be executed sequentially, only one job at a time, but everything else can be executed in parallel. Preferably, no single queue should hog the workers (or whatever is actually consuming the tasks), so that all queues get more or less the same priority.
  • Some queues might fill up with hundreds of tasks at a given time but most likely they will be empty a lot of the time.
  • Queues need to be persistent.
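
To make the ordering requirement concrete, here is a minimal in-memory sketch of "sequential within one queue, parallel across queues". It is only an illustration: the class name `KeyedSerialExecutor`, the key format and the `demo` function are made up for this example, and it does not cover persistence or fairness between queues.

```typescript
// Hypothetical helper: jobs that share a key run one at a time, in order,
// while jobs under different keys run concurrently.
type Job<T> = () => Promise<T>;

class KeyedSerialExecutor {
  // Tail of the promise chain for each queue key (e.g. "alice:export").
  private tails = new Map<string, Promise<unknown>>();

  enqueue<T>(key: string, job: Job<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    // Start this job only after the previous job for the same key has settled.
    const result = previous.then(() => job());
    // The stored tail never rejects, so one failed job does not block the queue.
    const tail = result.catch(() => undefined);
    this.tails.set(key, tail);
    // Forget the key once its queue drains, so idle queues cost nothing.
    tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key);
    });
    return result;
  }
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Two jobs for the same queue plus one job for another user's queue.
async function demo(): Promise<string[]> {
  const executor = new KeyedSerialExecutor();
  const log: string[] = [];
  const job = (label: string, ms: number) => async () => {
    log.push(`start ${label}`);
    await sleep(ms);
    log.push(`end ${label}`);
  };
  await Promise.all([
    executor.enqueue("alice:export", job("A1", 30)),
    executor.enqueue("alice:export", job("A2", 10)),
    executor.enqueue("bob:export", job("B1", 10)),
  ]);
  return log;
}
```

In `demo`, A2 only starts after A1 has finished, while B1 runs alongside A1 because it belongs to a different queue.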

We currently have a RabbitMQ-based solution with one queue for every kind of task, shared by all users. Because all users dump tasks into the same queues, a queue can fill up with tasks from one specific user for a long time, and the rest of the users have to wait for those tasks to finish before their own start to be consumed. We have looked into priority queues, but we don't think that's the way to go for our use case.

The first somewhat logical solution we thought of is to create temporary queues whenever a user needs to run background jobs and delete them once they are empty. However, we are not sure whether having that many queues is scalable, and we are also struggling with dynamically creating RabbitMQ queues, exchanges, etc. (we have even read somewhere that it might be an anti-pattern?).

We have been doing some more research, and maybe the way to go would be a different tool such as Kafka, or a Redis-based library like BullMQ or similar.

What would you recommend?

Bernat Felip
  • Is there any chance that you could use a task table in a database and do the processing with a poll-based application (maintaining task state machines and grouping by task type as well)? That is another option you could explore if it suits your case. – Ramachandran.A.G Jun 15 '22 at 14:04
  • We actually already track the state of the tasks in some persistent tables, but polling would put a strain on the database. – Bernat Felip Jun 17 '22 at 07:30
  • Ah! I see. Coincidentally, something along similar lines made headlines on HN yesterday. It is an interesting read. I neither endorse nor disapprove of it; the answer in design depends on which DB is used, how long-lived the jobs are, how the polling works, etc. Very subjective and never a single answer, though :) https://www.scylladb.com/2022/06/14/how-palo-alto-networks-replaced-kafka-with-scylladb-for-stream-processing/ – Ramachandran.A.G Jun 17 '22 at 08:39

2 Answers


If you're on AWS, have you considered SQS? There is no limit on the number of standard queues you can create, and each standard queue can hold up to about 120,000 in-flight messages. This would seem to satisfy your requirements above.

Verbal_Kint
  • I am using GCP, so this is not really an option. I see that "Cloud Tasks" would be the way to go for this use case, but it has a 1,000-queue limit, does not guarantee ordering, and queue re-creation is troublesome, so I don't know if I could work with that either. – Bernat Felip Jun 14 '22 at 10:21
  • @BernatFelip you can easily call SQS from any server (GCP or otherwise) using the aws-sdk for your preferred language. – Verbal_Kint Jun 14 '22 at 15:40

While the SQS solution mentioned above did prove to be very scalable, the amount of polling we would need to do, or the alternative of using SNS, made it less than optimal for us. On the other hand, implementing a homemade solution via database polling was too much for our use case, and we did not have the time or computational resources to consider adding a new database to our stack.

Luckily, we found that the Pro version of BullMQ has a "Groups" feature, which serves jobs from different groups in a round-robin fashion within a single queue. This fit our use case perfectly and is what we ended up using.
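
For anyone unfamiliar with the idea, the round-robin behaviour can be pictured with a small in-memory model. This is only a sketch of the scheduling concept and not BullMQ Pro's API: `GroupedQueue`, `add`, `next` and `drain` are hypothetical names, and using one group per user is an assumption about how the groups would be keyed.

```typescript
// Hypothetical in-memory model of grouped round-robin scheduling.
class GroupedQueue<T> {
  // Pending jobs per group, oldest first.
  private groups = new Map<string, T[]>();
  // Ids of groups that currently have pending jobs, in serving order.
  private rotation: string[] = [];

  add(groupId: string, job: T): void {
    let pending = this.groups.get(groupId);
    if (!pending) {
      pending = [];
      this.groups.set(groupId, pending);
      this.rotation.push(groupId);
    }
    pending.push(job);
  }

  // Take the oldest job of the group at the front of the rotation, then move
  // that group to the back so the other groups are served before it again.
  next(): T | undefined {
    const groupId = this.rotation.shift();
    if (groupId === undefined) return undefined;
    const pending = this.groups.get(groupId)!;
    const job = pending.shift();
    if (pending.length > 0) this.rotation.push(groupId);
    else this.groups.delete(groupId);
    return job;
  }
}

// One busy user and two light users sharing a single queue.
function drain(): string[] {
  const queue = new GroupedQueue<string>();
  for (const job of ["a1", "a2", "a3"]) queue.add("alice", job);
  queue.add("bob", "b1");
  queue.add("carol", "c1");
  queue.add("carol", "c2");

  const order: string[] = [];
  let job: string | undefined;
  while ((job = queue.next()) !== undefined) order.push(job);
  return order; // a1, b1, c1, a2, c2, a3
}
```

Even though alice enqueued three jobs first, bob and carol are served after her first job instead of waiting for her whole backlog. A real multi-worker setup would additionally need to cap each group at one job in flight to keep the per-queue ordering; this sketch leaves that out.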

Bernat Felip