42

Celery defaults to using pickle as its serialization method for tasks. As noted in the FAQ, this represents a security hole. Celery allows you to configure how tasks get serialized using the CELERY_TASK_SERIALIZER configuration parameter.

But this doesn't solve the security problem. Even if tasks are serialized with JSON or similar, the workers will still execute tasks inserted into the queue with pickle serialization -- they just respond to the content-type parameter in the message. So anybody who can write to the task queue can effectively pwn the worker processes by writing malicious pickled objects.
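
To make the risk concrete, here is a minimal, harmless sketch using only the standard library. Unpickling calls whatever callable the payload names in `__reduce__`. The `Payload` class and the arithmetic expression are illustrative assumptions; a real attacker would substitute something far less friendly.

```python
import pickle


class Payload:
    """Hypothetical attacker-controlled object (illustration only)."""

    def __reduce__(self):
        # pickle stores "call eval('6 * 7')" instead of the object's state.
        return (eval, ("6 * 7",))


blob = pickle.dumps(Payload())

# The consumer does not need the Payload class; loading the bytes runs eval.
result = pickle.loads(blob)
print(result)
```

Note that `result` is whatever the callable returned, not a `Payload` instance, which is why a worker that unpickles queue messages is only as trustworthy as everyone who can write to the queue.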

How can I prevent the worker processes from running tasks serialized with pickle?

Leopd
  • 2
    Another approach (which I haven't tried) is of course to ensure that only trusted clients can push to the task queue. Which is probably a good idea anyway. – Thomas K Jul 08 '11 at 18:19
  • Absolute newbie to Celery, not sure I get the point of serialization. Can anyone please enlighten me on the need for serialization, or point me to any documentation in that direction? – Shankar ARUL Aug 26 '15 at 09:31
  • 1
    @sarul there is no shared memory between the process which enqueues the task and the worker process which runs the task; they may be on separate servers. So the Python objects that you send as task args have to be serialized somehow for transmission between them. Also, the message queue (e.g. RabbitMQ) that serves as the transport deals in text messages only. – Anentropic Oct 13 '15 at 16:06

3 Answers

63

I was getting "ContentDisallowed: Refusing to deserialize untrusted content of type pickle (application/x-python-serialize)".

Having only:

CELERY_ACCEPT_CONTENT = ['json']

wasn't enough. I also had to add the following to settings:

CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
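
One plausible reading of this (an assumption, since the answer does not say so) is that the producer was still publishing with pickle, the default at the time, while the worker accepted only JSON. The sketch below models that check with the standard library. `decode`, `DECODERS`, and the local `ContentDisallowed` class are hypothetical stand-ins, not Celery's or kombu's actual code.

```python
import json
import pickle


class ContentDisallowed(Exception):
    """Local stand-in for the error a consumer raises (not kombu's class)."""


# Hypothetical registry: content type -> decoder function.
DECODERS = {
    "application/json": lambda body: json.loads(body.decode("utf-8")),
    "application/x-python-serialize": pickle.loads,
}


def decode(body, content_type, accept):
    """Decode a message body only if its content type is on the allow-list."""
    if content_type not in accept:
        raise ContentDisallowed(
            f"Refusing to deserialize untrusted content of type {content_type}"
        )
    return DECODERS[content_type](body)


# Roughly what CELERY_ACCEPT_CONTENT = ['json'] amounts to in this model.
accept = {"application/json"}

# A producer that serializes with JSON is accepted.
ok = decode(json.dumps({"args": [2, 3]}).encode("utf-8"), "application/json", accept)

# A producer still using pickle is refused before pickle.loads is ever called.
try:
    decode(pickle.dumps({"args": [2, 3]}), "application/x-python-serialize", accept)
    refused = False
except ContentDisallowed:
    refused = True
```

In this model the refusal happens on the consuming side, so the producer also has to switch serializers before its tasks get through.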
naoko
10

I got an answer from the celery-users mailing list (from Ask Solem, to be specific). Add these two lines to the config (celeryconfig/settings):

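from kombu import serialization
# Remove the pickle decoder so pickled messages can no longer be deserialized.
# Note: _decoders is a private attribute of the registry.
serialization.registry._decoders.pop("application/x-python-serialize")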
Leopd
  • 2
    Using protected methods is usually a bad idea. I appreciate this may not have been the case at the time this answer was proposed, but it's certainly not the case now. – Lloyd Moore Jan 29 '16 at 15:39
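
For anyone who wants to avoid the private `_decoders` attribute, kombu also exposes a public helper for the same purpose. A minimal sketch, assuming a kombu version that provides `disable_insecure_serializers` (check the documentation for the version you run):

```python
from kombu import serialization

# Disable every registered decoder except JSON for consumers in this process.
serialization.disable_insecure_serializers(allowed=['json'])
```

As far as I can tell this only restricts what consumers will accept; producers can still serialize in other formats, so the task serializer setting still matters.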
6

Now that Celery supports configuration on a per-app basis, there is a cleaner way to restrict the content that a consumer will execute.

import celery

c = celery.Celery()
c.conf.update(CELERY_ACCEPT_CONTENT=['json'])

See the Celery docs on security for details, and for more advanced security options, such as signing content.

sirdodger