I'm creating a new service whose database entries (Mongo) have a state field that I need to update based on the current time. For instance, if the start time was set to two hours from now, I need to change the state from CREATED -> STARTED in the database once that time arrives, and there can be multiple such states.

Approaches I've thought of:

  1. Keep polling the database for entries whose time is <= the current time and change their states accordingly. This causes extra reads, about half of which come back empty, and it will get complicated fast as more states are added.

  2. Write a job scheduler (I am using Go, so that wouldn't be too hard) and schedule all the jobs in it, but I might lose the queued jobs in case of a panic/crash.

  3. Use an existing product such as Celery; I have found a Go implementation of it: https://github.com/gocelery/gocelery

Another task scheduler I've found is on Google Cloud https://cloud.google.com/solutions/reliable-task-scheduling-compute-engine, but I don't want to get stuck in proprietary technologies.

I wanted to use a PubSub service for this, but I couldn't find one that supports delayed messages (if that's a thing). My main problem is that I can't find the actual name for this problem, so I can't search for it properly; I've even tried searching the Microsoft docs. If someone can point me in the right direction, or tell me whether any of the approaches above is the one I should use, that would be a great help!

UPDATE: Found one more solution to the same problem, by Netflix: https://medium.com/netflix-techblog/distributed-delay-queues-based-on-dynomite-6b31eca37fbc

ishaan

1 Answer

I think you are right in that the problem you are trying to solve is the job or task scheduling problem.

One approach that many companies use is the one you are proposing: jobs are inserted into a datastore along with the time they should execute, and the datastore is then polled for jobs that are due. Optimizations such as polling at a regular interval and using exponential back-off reduce the extra reads. The advantage of this system is that it tolerates node failure; the disadvantage is the added complexity.

Looking around, in addition to the one you linked (https://github.com/gocelery/gocelery) there are other implementations of this model (https://github.com/ajvb/kala or https://github.com/rakanalh/scheduler were ones I found after a quick search).

The other approach you described, scheduling jobs in-process, is very simple in Go because parked goroutines are extremely cheap: you can spawn one goroutine per job and have it sleep until the job is due. The downside is that if the process dies, the pending jobs are lost.

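go func() {
    // Park this goroutine until the scheduled time. time.Until returns a
    // non-positive duration for a time in the past, so the timer fires at once.
    <-time.After(time.Until(expirationTime))

    // do work here.
}()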

A final approach that I have seen but wouldn't recommend is the callback model (something like https://gitlab.com/andreynech/dsched). Here your service calls another service (over HTTP, gRPC, etc.) and schedules a callback for a specific time. The advantage is that if you have multiple services in different languages, they can all use the same scheduler.

Overall, before you decide on a solution, I would consider some trade-offs:

  • How acceptable is job loss? If it's ok that some jobs are lost a small percentage of the time, maybe an in-process solution is acceptable.
  • How long will jobs be waiting? If it's longer than the shutdown period of your host, maybe a datastore-based solution is better.
  • Will you need to distribute job load across multiple machines? If you need to distribute the load, sharding and scheduling are tricky things and you might want to consider using a more off-the-shelf solution.

Good luck! Hope that helps.

dolan
  • Thanks for the answer. So far, the first polling approach seems like the only one that's going to work for me. I'll start with static time intervals and take it from there. I cannot afford to lose job data, and I don't really need to distribute jobs across multiple machines, but it's good to have, and the DB approach will be able to scale horizontally. I think I'll use a separate DB instance for this rather than the one serving the app, and only update the original DB once I have objects to update. And thanks for the alternatives to gocelery, I'll definitely take a look at them. :) – ishaan Sep 10 '18 at 06:28