I am using Celery and celery beat to handle task execution and scheduled tasks in a Python project. I am not using Django.

Execution of Celery tasks is working as expected. However, I have run into a wall trying to get scheduled tasks (celery beat) to run.

I have followed the Celery documentation to add my task to app.conf.beat_schedule. If I print out the beat schedule after adding my task, I can see that the task is present in app.conf.beat_schedule.

import importlib
import logging

from celery import Celery

logger = logging.getLogger("gateway")

# Celery init
app = Celery('tasks', broker='pyamqp://guest@localhost//')

# get the latest device reading from the appropriate provider
@app.task(bind=True, retry_backoff=True)
def get_reading(self, provider, path, device, config, location, callback):
    logger.info("get_reading() called")
    module = importlib.import_module('modules.%s' % provider)
    try:
        module.get_reading(path, device, config, location, callback)
    except Exception as e:
        # self.retry() raises a Retry exception; re-raising makes that explicit
        raise self.retry(exc=e)

# add the periodic task
def add_get_reading_periodic_task(provider, path, device, config, location, callback, interval=600.0):
    app.conf.beat_schedule = {
        "poll-provider": {
            "task": "get_reading",
            "schedule": interval,
            "args": (provider, path, device, config, location, callback),
        }
    }

    logger.info(app.conf.beat_schedule)
    logger.info("Added task 'poll-provider' for %s to beat schedule" % provider)

Looking at my application log, I can see that app.conf.beat_schedule has been updated with the data passed to add_get_reading_periodic_task():

2017-08-17 11:07:13,216 - gateway - INFO - {'poll-provider': {'task': 'get_reading', 'schedule': 10, 'args': ('provider1', '/opt/provider1', None, {'location': {'lan.local': {'uri': 'http://192.168.1.10'}}}, 'lan.local', {'url': 'http://localhost:8080', 'token': '*******'})}}
2017-08-17 11:07:13,216 - gateway - INFO - Added task 'poll-provider' for provider1 to beat schedule

I'm manually running celery worker and celery beat simultaneously (in different terminal windows) on the same application file:

$ celery worker -A gateway --loglevel=INFO
$ celery beat -A gateway --loglevel=DEBUG

If I call get_reading.delay(...) within my application, it is executed by the celery worker as expected.

However, the celery beat process never shows any indication that the scheduled task is registered:

celery beat v4.0.2 (latentcall) is starting.
__    -    ... __   -        _
LocalTime -> 2017-08-17 11:05:15
Configuration ->
    . broker -> amqp://guest:**@localhost:5672//
    . loader -> celery.loaders.app.AppLoader
    . scheduler -> celery.beat.PersistentScheduler
    . db -> celerybeat-schedule
    . logfile -> [stderr]@%DEBUG
    . maxinterval -> 5.00 minutes (300s)
[2017-08-17 11:05:15,228: DEBUG/MainProcess] Setting default socket timeout to 30
[2017-08-17 11:05:15,228: INFO/MainProcess] beat: Starting...
[2017-08-17 11:05:15,248: DEBUG/MainProcess] Current schedule:
<ScheduleEntry: celery.backend_cleanup celery.backend_cleanup() <crontab: 0 4 * * * (m/h/d/dM/MY)>
[2017-08-17 11:05:15,248: DEBUG/MainProcess] beat: Ticking with max interval->5.00 minutes
[2017-08-17 11:05:15,250: DEBUG/MainProcess] beat: Waking up in 5.00 minutes.
[2017-08-17 11:10:15,351: DEBUG/MainProcess] beat: Synchronizing schedule...
[2017-08-17 11:10:15,355: DEBUG/MainProcess] beat: Waking up in 5.00 minutes.
[2017-08-17 11:15:15,400: DEBUG/MainProcess] beat: Synchronizing schedule...
[2017-08-17 11:15:15,402: DEBUG/MainProcess] beat: Waking up in 5.00 minutes.
[2017-08-17 11:20:15,502: DEBUG/MainProcess] beat: Synchronizing schedule...
[2017-08-17 11:20:15,504: DEBUG/MainProcess] beat: Waking up in 5.00 minutes.

This is seemingly confirmed by running celery inspect scheduled:

-> celery@localhost.lan: OK
- empty -

I have tried starting celery beat both before and after adding the scheduled task to app.conf.beat_schedule, and in both cases the scheduled task never appears in celery beat.

I read that celery beat did not support dynamic reloading of the configuration until version 4, but I am running celery beat 4.0.2.

What am I doing wrong here? Why isn't celery beat showing my scheduled task?

hwmartin
  • I am having a similar issue. If I get the answer, I will let you know. – Ouss Aug 18 '17 at 17:17
  • Are you on a Mac? – Ouss Aug 18 '17 at 17:48
  • No, this is on Ubuntu 17.04 – hwmartin Aug 21 '17 at 07:55
  • Found this thread: https://stackoverflow.com/questions/41124591/setting-up-periodic-tasks-in-celery-celerybeat-dynamically-using-add-periodic It seems you need a different Scheduler class! – Ouss Aug 22 '17 at 09:53
  • Maybe you could get some inspiration from this GitHub repository: https://github.com/zmap/celerybeat-mongo – Ouss Aug 22 '17 at 09:54
  • Why would I want to use MongoDB when I already have RabbitMQ working for scheduling tasks? My issue is not with the celery back end, it is that celery-beat does not implement the functionality I'm looking for. – hwmartin Oct 28 '17 at 10:01
  • The other stackoverflow answer is about _dynamically managing periodic tasks_ but I want to **dynamically create** periodic tasks. – hwmartin Oct 28 '17 at 10:13

2 Answers

Have you tried using the code as described in the documentation?

@app.task
def test(arg):
    print(arg)

@app.on_after_configure.connect
def setup_periodic_tasks(sender, **kwargs):
    # Calls test('hello') every 10 seconds.
    sender.add_periodic_task(10.0, test.s('hello'), name='add every 10')
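
If the set of providers is known by the time the module is imported, the schedule does not have to be hard-coded; it can be generated from data before beat starts. The following is only a minimal sketch under assumptions that are not in the question: the providers are listed in a hypothetical providers.json file, the helper names build_beat_schedule and load_providers are made up for illustration, and the task is registered under the name get_reading as written in the question.

```python
import json
import tempfile
from pathlib import Path


def build_beat_schedule(providers, default_interval=600.0):
    """Turn a list of provider dicts into a beat_schedule-style mapping."""
    schedule = {}
    for entry in providers:
        name = entry["provider"]
        schedule["poll-%s" % name] = {
            # Assumption: the task is registered under exactly this name.
            "task": "get_reading",
            "schedule": float(entry.get("interval", default_interval)),
            "args": (
                name,
                entry["path"],
                entry.get("device"),
                entry.get("config", {}),
                entry["location"],
                entry.get("callback", {}),
            ),
        }
    return schedule


def load_providers(path):
    """Read the (hypothetical) provider list from a JSON file."""
    return json.loads(Path(path).read_text())


# Demo with a throwaway file standing in for providers.json.
with tempfile.TemporaryDirectory() as tmp:
    config_file = Path(tmp) / "providers.json"
    config_file.write_text(json.dumps([
        {"provider": "provider1", "path": "/opt/provider1",
         "location": "lan.local", "interval": 10},
    ]))
    schedule = build_beat_schedule(load_providers(config_file))

print(sorted(schedule))  # one entry per provider
```

In the application module this would be assigned once at import time, for example app.conf.beat_schedule = build_beat_schedule(load_providers("providers.json")), so that the beat process sees it when it loads the app. Entries added to the file later would still need a beat restart to be picked up.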

Ouss
  • Yes, I have followed the documentation (as I state in my question), and I have no problem following the examples in the Celery documentation and getting a working celery beat. However, the examples are not helpful for my use case, as I want to **dynamically create celery beat tasks**. The Celery examples (such as the one you've posted) are statically defined tasks, which is not my use case. – hwmartin Aug 22 '17 at 07:30
  • Celery beat stores the dynamically added tasks in a shelve database (basically a file in your project folder). If the celery user doesn't have permission to write to that file, tasks will not be added dynamically. – Ouss Aug 22 '17 at 07:46
  • The user running celery beat has write access to pwd. When I run celery the _celerybeat-schedule_ file is created. This is not a permissions issue. – hwmartin Aug 22 '17 at 07:47
  • So... Any luck so far? – Ouss Aug 24 '17 at 15:35
  • I determined that the functionality I wanted is not supported by celery in its current form. I misinterpreted the beat scheduler features from the documentation. – hwmartin Oct 28 '17 at 09:58
  • Your code snippet is from the first link in my question. While I appreciate your effort, I am already aware of this example and it works as expected because it's not a dynamically created task. – hwmartin Oct 28 '17 at 10:05
  • This is pretty old but you can add tasks dynamically with celery beat – safhac Jan 06 '19 at 15:13
  • @safhac if you have working code to dynamically add a celery beat task, please post it as an answer – hwmartin Nov 14 '19 at 10:59
  • @safhac please do – superandrew Feb 15 '20 at 21:56
  • Hey @supradr, I've posted the code above. Please check which version of Python/Celery you're using. I'm using Python 3.6; some features of asyncio are only fully implemented in >= 3.7. – safhac Feb 16 '20 at 10:40

Create a model for creating periodic tasks (this approach requires Django and its Celery beat database models):

# Assumes Django with django-celery-beat installed; the older django-celery
# package exposed the same models under djcelery.models.
from django.db import models
from django_celery_beat.models import (
    CrontabSchedule, IntervalSchedule, PeriodicTask)

# Node is the application's own model and is defined elsewhere.


class TaskScheduler(models.Model):

    periodic_task = models.OneToOneField(
        PeriodicTask, on_delete=models.CASCADE, blank=True, null=True)

    node = models.OneToOneField(
        Node, on_delete=models.CASCADE, blank=True, null=True)

    @staticmethod
    def schedule_every(task_name, period, every, scheduler_name, args, queue=None):
        # schedules a task by name every "every" "period". So an example call would be:
        # TaskScheduler.schedule_every('mycustomtask', 'seconds', 30, 'my-schedule', json.dumps([1, 2, 3]))
        # that would schedule your custom task to run every 30 seconds with the arguments 1, 2 and 3 passed to the actual task.
        permissible_periods = ['days', 'hours', 'minutes', 'seconds']
        if period not in permissible_periods:
            raise ValueError('Invalid period specified')
        # create the periodic task and the interval
        # create some name for the periodic task
        ptask_name = scheduler_name
        interval_schedules = IntervalSchedule.objects.filter(
            period=period, every=every)
        if interval_schedules:  # reuse a matching interval schedule if one already exists
            interval_schedule = interval_schedules[0]
        else:  # create a brand new interval schedule
            interval_schedule = IntervalSchedule()
            interval_schedule.every = every  # should check to make sure this is a positive int
            interval_schedule.period = period
            interval_schedule.save()

        if queue:
            ptask = PeriodicTask(name=ptask_name, task=task_name,
                                 interval=interval_schedule, queue=queue)
        else:
            ptask = PeriodicTask(name=ptask_name, task=task_name,
                                 interval=interval_schedule)

        if args:
            ptask.args = args  # expected to be a JSON-encoded list
        ptask.save()
        return TaskScheduler.objects.create(periodic_task=ptask)

    @staticmethod
    def schedule_cron(task_name, at, scheduler_name, args):
        # schedules a task by name every day at the @at time.
        # create some name for the periodic task
        ptask_name = scheduler_name

        crons = CrontabSchedule.objects.filter(
            hour=at.hour, minute=at.minute)
        if crons:  # reuse a matching CrontabSchedule if one already exists
            cron = crons[0]
        else:  # create a brand new CrontabSchedule
            cron = CrontabSchedule()
            cron.hour = at.hour
            cron.minute = at.minute
            cron.save()
        ptask = PeriodicTask(name=ptask_name,
                             crontab=cron, task=task_name)

        if args:
            ptask.args = args  # expected to be a JSON-encoded list
        ptask.save()
        return TaskScheduler.objects.create(periodic_task=ptask)

    def stop(self):
        """ pauses the task """
        ptask = self.periodic_task
        ptask.enabled = False
        ptask.save()

    def start(self):
        """ resumes the task """
        ptask = self.periodic_task
        ptask.enabled = True
        ptask.save()

    def terminate(self):
        """ stops the task and deletes it together with this record """
        self.stop()
        ptask = self.periodic_task
        self.delete()
        ptask.delete()

and then, from your code:

import json

# give the periodic task a unique name
scheduler_name = "%s_%s" % ('node_heartbeat', str(node.pk))
# organize the task arguments (PeriodicTask.args is stored as a JSON string)
args = json.dumps([node.extern_id, node.pk])
# create the periodic task in the heartbeat queue
task_scheduler = TaskScheduler.schedule_every(
    'core.tasks.pdb_heartbeat', 'seconds', 15, scheduler_name, args, 'heartbeat')

task_scheduler.node = node
task_scheduler.save()

I came across the original class a long time ago (the link is in the comment below); I've added schedule_cron to it.
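
A comment in schedule_every notes that `every` should be checked to make sure it is a positive integer, but the class does not do so. A minimal sketch of such a check follows, kept free of Django so it can run on its own. The name validate_interval is hypothetical and not part of the original class; the list of periods is copied from the code above.

```python
PERMISSIBLE_PERIODS = ("days", "hours", "minutes", "seconds")


def validate_interval(period, every):
    """Return (period, every) if both are usable, otherwise raise ValueError."""
    if period not in PERMISSIBLE_PERIODS:
        raise ValueError("Invalid period specified: %r" % (period,))
    # bool is a subclass of int, so it is rejected explicitly.
    if isinstance(every, bool) or not isinstance(every, int) or every <= 0:
        raise ValueError("'every' must be a positive integer, got %r" % (every,))
    return period, every


print(validate_interval("seconds", 15))
```

Calling it at the top of schedule_every, before any database query, would keep invalid intervals from being saved as IntervalSchedule rows.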

safhac
  • original code link https://groups.google.com/forum/#!msg/celery-users/CZXCh8sCK5Q/ihZgMV2HWWYJ – safhac Nov 17 '19 at 08:17