
I am running Superset via Docker. I enabled the Email Report feature and scheduled a report. However, I only receive the initial test email report; no emails arrive after that.

This is my CeleryConfig in superset_config.py:

from celery.schedules import crontab

class CeleryConfig(object):
    BROKER_URL = 'sqla+postgresql://superset:superset@db:5432/superset'
    CELERY_IMPORTS = (
        'superset.sql_lab',
        'superset.tasks',
    )
    CELERY_RESULT_BACKEND = 'db+postgresql://superset:superset@db:5432/superset'
    CELERYD_LOG_LEVEL = 'DEBUG'
    CELERYD_PREFETCH_MULTIPLIER = 10
    CELERY_ACKS_LATE = True
    CELERY_ANNOTATIONS = {
        'sql_lab.get_sql_results': {
            'rate_limit': '100/s',
        },
        'email_reports.send': {
            'rate_limit': '1/s',
            'time_limit': 120,
            'soft_time_limit': 150,
            'ignore_result': True,
        },
    }
    CELERYBEAT_SCHEDULE = {
        'email_reports.schedule_hourly': {
            'task': 'email_reports.schedule_hourly',
            'schedule': crontab(minute=1, hour='*'),
        },
    }

The documentation says I need to run a Celery worker and Celery beat:

celery worker --app=superset.tasks.celery_app:app --pool=prefork -O fair -c 4
celery beat --app=superset.tasks.celery_app:app

I added both commands to 'docker-compose.yml':

  superset-worker:
    build: *superset-build
    command: >
      sh -c "celery worker --app=superset.tasks.celery_app:app -Ofair -f /app/celery_worker.log &&
             celery beat --app=superset.tasks.celery_app:app -f /app/celery_beat.log"
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes

The Celery worker is indeed running: it sends the initial test email, and 'celery_worker.log' is created. Celery beat, however, does not appear to be running at all, and no 'celery_beat.log' is ever created.

If you'd like deeper insight, here's the commit with the full implementation of the functionality.

How do I correctly configure celery beat? How can I debug this?

Snow

3 Answers


I managed to solve it by altering the CeleryConfig implementation and adding a dedicated beat service to 'docker-compose.yml'. With the '&&' in my original command, beat would only start after the worker process exited, so it never ran.

New CeleryConfig class in 'superset_config.py':

from celery.schedules import crontab

REDIS_HOST = get_env_variable("REDIS_HOST")
REDIS_PORT = get_env_variable("REDIS_PORT")

class CeleryConfig(object):
    BROKER_URL = "redis://%s:%s/0" % (REDIS_HOST, REDIS_PORT)
    CELERY_IMPORTS = (
        'superset.sql_lab',
        'superset.tasks',
    )
    CELERY_RESULT_BACKEND = "redis://%s:%s/1" % (REDIS_HOST, REDIS_PORT)
    CELERY_ANNOTATIONS = {
        'sql_lab.get_sql_results': {
            'rate_limit': '100/s',
        },
        'email_reports.send': {
            'rate_limit': '1/s',
            'time_limit': 120,
            'soft_time_limit': 150,
            'ignore_result': True,
        },
    }
    CELERY_TASK_PROTOCOL = 1
    CELERYBEAT_SCHEDULE = {
        'email_reports.schedule_hourly': {
            'task': 'email_reports.schedule_hourly',
            'schedule': crontab(minute='1', hour='*'),
        },
    }
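
For reference, get_env_variable is the helper defined in Superset's Docker superset_config.py. In case your config doesn't already define it, here is a minimal sketch (the exact error wording is an assumption):

import os

def get_env_variable(var_name, default=None):
    # Read a variable from the environment; fall back to a default, or fail loudly.
    try:
        return os.environ[var_name]
    except KeyError:
        if default is not None:
            return default
        raise EnvironmentError("The environment variable %s was missing." % var_name)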

Changes in 'docker-compose.yml':

  superset-worker:
    build: *superset-build
    command: ["celery", "worker", "--app=superset.tasks.celery_app:app", "-Ofair"]
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes

  superset-beat:
    build: *superset-build
    command: ["celery", "beat", "--app=superset.tasks.celery_app:app", "--pidfile=", "-f", "/app/celery_beat.log"]
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes
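
Note that BROKER_URL and CELERY_RESULT_BACKEND now point at Redis, so this assumes a redis service in the same compose file and REDIS_HOST/REDIS_PORT entries in docker/.env. A minimal sketch, with the service name, image tag, and values as assumptions:

  redis:
    image: redis:5
    restart: unless-stopped

And in docker/.env:

REDIS_HOST=redis
REDIS_PORT=6379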
Snow

I believe Celery needs to run inside your Superset container, so you'll need to modify your Dockerfile and entrypoint. But you should really daemonize Celery first so you don't have to monitor and restart it yourself (see 'How to detect failure and auto restart celery worker' and http://docs.celeryproject.org/en/latest/userguide/daemonizing.html). For an example of running a daemonized Celery process in Docker, see 'Docker - Celery as a daemon - no pidfiles found'.
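
For instance, the daemonizing guide's celery multi command starts a detached worker that writes a pidfile and logfile (the node name and paths below are assumptions):

celery multi start worker1 --app=superset.tasks.celery_app:app \
    --pidfile=/var/run/celery/%n.pid --logfile=/var/log/celery/%n%I.log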

David Tobiano

You can also add the -B flag to the celery worker command so the worker embeds beat:

celery worker --app=superset.tasks.celery_app:app --pool=prefork -O fair -c 4 -B
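
In this docker-compose setup, that collapses worker and beat into a single service; a sketch reusing the anchors from the question (assuming you only ever run one worker node):

  superset-worker:
    build: *superset-build
    command: ["celery", "worker", "--app=superset.tasks.celery_app:app", "-Ofair", "-B"]
    env_file: docker/.env
    restart: unless-stopped
    depends_on: *superset-depends-on
    volumes: *superset-volumes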
Ryabchenko Alexander
  • That's not recommended in production when you have multiple workers. – Snow May 12 '20 at 11:33
  • It is better than two separate commands/processes in Docker. I have had -B in prod for 1.5 years, all fine. – Ryabchenko Alexander May 12 '20 at 13:46
  • How many workers do you have running? The [docs](https://docs.celeryproject.org/en/latest/userguide/periodic-tasks.html#starting-the-scheduler) say "You can also embed beat inside the worker by enabling the workers -B option, this is convenient if you’ll never run more than one worker node, but it’s not commonly used and for that reason isn’t recommended for production use". – Snow May 12 '20 at 13:50
  • If you have only one worker then fine, but you shouldn't use -B for multiple workers. – Snow May 12 '20 at 13:51
  • There were 6 containers with different params for the workers, and one of them had the -B option; all worked fine. – Ryabchenko Alexander May 12 '20 at 17:55
  • @RyabchenkoAlexander I need help with some mail setup, can you help? – Aditya Verma Dec 15 '20 at 17:19