Celery doesn't support this out of the box, but I've had to do similar things in the past and stumbled through a solution on my own.
In my experience, there are two somewhat straightforward ways to achieve this, both with trade-offs. There are also some pretty big holes you can step into with this stuff, so caveat emptor.
Option 1:
Use some datastore to save information about when the task should next run, and have a celery beat task check it.
To do this, you could use your database and a model that holds some information about the periodic task. (If you wanted to get more technical, you could probably talk to the queue directly and skip the models route entirely.)
from django.db import models


class PeriodicTask(models.Model):
    lastrun = models.DateTimeField()
    nextrun = models.DateTimeField()
    notes = models.TextField(blank=True)  # errors? debugging breadcrumbs?
    task_id = models.CharField(max_length=100, blank=True)
That's just kind of a rough idea of what the model might store. You can put whatever is useful on there, but you'll need a datetime object to store when the next run should be.
Next, you need a periodic task that runs much more frequently than the 30-minute interval, spins up, and checks whether any tasks need to be executed soon:
import datetime

from celery.task import periodic_task

from .models import PeriodicTask


@periodic_task(run_every=datetime.timedelta(minutes=2),
               queue='activities', options={'queue': 'activities'})
def pull_activities_frequent_adaptors():
    now = datetime.datetime.utcnow()  # need to be clear about time zones
    scheduled_tasks = PeriodicTask.objects.filter(nextrun__gte=now)
    if scheduled_tasks.count() == 1:  # more than one and we've erred somewhere
        timewindow = datetime.timedelta(minutes=5)
        if (scheduled_tasks[0].nextrun - now) <= timewindow:
            scheduled_tasks[0].delete()
            # Do the task...
            # ...then schedule the next one
            PeriodicTask.objects.create(
                lastrun=now,
                nextrun=now + datetime.timedelta(minutes=30))
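One thing the code above glosses over: nothing ever fires until a first row exists, so you have to seed the table once. Here's a minimal sketch, assuming the model above (bootstrap_schedule is just a name I made up; run it once from a shell or a management command):

import datetime

from .models import PeriodicTask


def bootstrap_schedule():
    # Seed the chain with the single row the polling task looks for.
    # Guard against starting a second chain if one already exists.
    if not PeriodicTask.objects.exists():
        now = datetime.datetime.utcnow()
        PeriodicTask.objects.create(
            lastrun=now,
            nextrun=now + datetime.timedelta(minutes=30))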
Potential issues:
1) If you have multiple databases in a master-slave setup, and in particular if you have replication lag, you could end up double-scheduling things (even with the count() == 1 check). So there's a race condition worth thinking about; see the locking sketch after this list.
2) It's hard to get close to exactly 30 minutes because you have to use a time window to find tasks to execute.
3) The polling task needs to run more often than your time window is wide, otherwise you could miss a scheduled run. That's potentially a waste of resources (though not a terrible one, I suppose), because it usually spins up and does nothing.
4) Nothing boggles the mind more than dealing with datetimes, so you really have to consider time zones (using timezone-aware datetimes everywhere, e.g. django.utils.timezone.now(), helps), think about all the variations, and test the hell out of this code.
5) This one is a biggie: if the task takes longer to run than the interval it's scheduled for, then you'll have two tasks running concurrently, which is a problem. Again, with race conditions, things can get dicey (the locking sketch below helps here too).
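For issues 1 and 5, you can shrink (though not eliminate) the race window with a row lock. This is only a sketch, assuming a database that supports SELECT ... FOR UPDATE (Postgres, MySQL/InnoDB) and that these reads hit the master, not a lagging slave; claim_due_task is a helper name I made up:

from django.db import transaction

from .models import PeriodicTask


def claim_due_task(now, timewindow):
    # Returns True if this worker won the right to run the task.
    # select_for_update() locks the row, so two workers can't both
    # delete it and do the work.
    with transaction.atomic():
        due = (PeriodicTask.objects
               .select_for_update()
               .filter(nextrun__lte=now + timewindow))
        task = due.first()
        if task is None:
            return False
        task.delete()
        return True

The polling task would call this and only do the work (and create the next row) if it returns True.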
Option 2:
Don't use celery beat: fire off the first task and have it fire off the next one after 30 minutes. This has the potential to become a runaway sorcerer's-apprentice type of thing, so I find it a bit, um, scary; while I have done the first option, I've never really talked myself into this one. But I think it could be done:
from celery.task import task


@task  # no longer a periodic task
def your_task(args):
    # Whatever you want to do, then call itself again...
    # (note the trailing comma: args must be a tuple or list)
    your_task.apply_async(args=(args,), countdown=1800)  # 1800s = 30 minutes
Now you just need to call this somewhere, probably from a cron job that spins up once a week, kills any previous versions of this thing (how does it find them?), and then fires off the first one.
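As for "how does it find them?": one possibility is to record the pending task's id somewhere shared every time the chain re-fires (Django's cache, say), and have the cron job revoke whatever id is stored before kicking off a fresh chain. You'd extend your_task above along these lines. Again just a sketch; CHAIN_KEY and kickoff are names I made up, and revoked states aren't persistent across worker restarts unless you run the workers with a state db:

from celery.task import task
from celery.task.control import revoke
from django.core.cache import cache

CHAIN_KEY = 'your_task_chain_id'  # made-up cache key


@task
def your_task(args):
    # ... do the work, then re-fire and remember the pending task's id
    result = your_task.apply_async(args=(args,), countdown=1800)
    cache.set(CHAIN_KEY, result.id, None)  # None = cache forever


def kickoff(args):
    # Called from the weekly cron job: revoke the pending link in the
    # old chain (if any), then start a new one.
    old_id = cache.get(CHAIN_KEY)
    if old_id:
        revoke(old_id)
    result = your_task.apply_async(args=(args,))
    cache.set(CHAIN_KEY, result.id, None)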
I have to say that I don't really like this answer and even though it's occurred to me a handful of times, it seems like a more dangerous and unruly way to address the problem. I'd be curious if anyone does it, though.