
I'm trying to understand more fully how Celery/Kombu and Redis interact under the hood so that I can better project scaling and equipment cost, particularly in dev environments, where I'd like the smallest possible Redis setup and thus the fewest connections.

I've failed to find the specific documentation I'm looking for on this subject in the Celery or Kombu User Guides.

All of the following assumptions and questions are based on tests with a Celery app that defines no tasks (just app = Celery("DatabaseWorker", broker=redis_uri)) and otherwise sits idle, observed with the redis-cli tool (using the monitor command for real-time updates). I spin up a single worker of this type and get 8 open connections.
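
To make the per-connection grouping reproducible, here is a minimal sketch of how the redis-cli monitor output can be bucketed by client address. The function name `group_monitor_lines` and the regular expression are my own, and the line format is assumed from the output I captured rather than from any documented guarantee. The third sample line is abbreviated.

```python
import re
from collections import defaultdict

# Assumed shape of a `redis-cli monitor` line, taken from the captures in this post:
#   <timestamp> [<db> <client address>] "<COMMAND>" "<arg>" ...
MONITOR_LINE = re.compile(
    r'^(?P<ts>\d+\.\d+) \[(?P<db>\d+) (?P<client>\S+)\] "(?P<cmd>[^"]+)"'
)

def group_monitor_lines(lines):
    """Map each client address to the ordered list of commands it issued."""
    commands_by_client = defaultdict(list)
    for line in lines:
        match = MONITOR_LINE.match(line.strip())
        if match is None:
            continue  # skip "OK" and anything that does not look like a command line
        commands_by_client[match.group("client")].append(match.group("cmd"))
    return dict(commands_by_client)

sample = [
    '1449649220.026758 [0 [::1]:57605] "INFO"',
    '1449649220.027633 [0 [::1]:57605] "MULTI"',
    '1449649224.677975 [0 [::1]:57601] "BRPOP" "celery" "1"',
]
for client, commands in group_monitor_lines(sample).items():
    print(client, commands)
```

Note that this only sees connections that actually issue commands while monitor is running; a connection that is open but idle would not show up in the grouping.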

I'm looking for answers, or corrections to any misinformed assumptions:

  1. Why do all 8 connections remain open? Does this have to do with the fact that each connection is from a pool and thus never closes?

  2. If it is pool related, then why does setting values in celeryconfig.py such as the following still result in 8 connections? Neither option seems to have any effect.

    BROKER_TRANSPORT_OPTIONS = {
        'max_connections': 5,
    }
    BROKER_POOL_LIMIT = 5
    
  3. Two (2) of the connections appear to be associated with celeryev (for monitoring tools): one for publishing messages (issuing PUBLISH commands) and one for subscribing (PSUBSCRIBE). I can turn off publishing using --without-heartbeat (oddly, not by using the CELERY_SEND_EVENTS config var), saving a connection. Can I also prevent the subscription connection? In a development environment where I want as few connections as possible and do not care about monitoring, killing both would be great.

  4. Four (4) of the connections follow this pattern: they check for the existence of some queues/sets/keys, then add a value to a set, but still aren't closed. Why? For example:

    1449649220.026758 [0 [::1]:57605] "INFO"
    1449649220.027633 [0 [::1]:57605] "MULTI"
    1449649220.027655 [0 [::1]:57605] "LLEN" "celery@xxxxxx-MacBook-Pro.local.celery.pidbox"
    1449649220.027665 [0 [::1]:57605] "LLEN" "celery@xxxxxx-MacBook-Pro.local.celery.pidbox\x06\x163"
    1449649220.027674 [0 [::1]:57605] "LLEN" "celery@xxxxxx-MacBook-Pro.local.celery.pidbox\x06\x166"
    1449649220.027681 [0 [::1]:57605] "LLEN" "celery@xxxxxx-MacBook-Pro.local.celery.pidbox\x06\x169"
    1449649220.027691 [0 [::1]:57605] "EXEC"
    1449649220.027983 [0 [::1]:57605] "SADD" "_kombu.binding.celery.pidbox" "\x06\x16\x06\x16celery@xxxxxx-MacBook-Pro.local.celery.pidbox"
    
  5. One (1) connection seems to be used to bootstrap the system, setting up keys related to a pidbox and then publishing. Is this just to let the distributed system know that a new worker has come online?

    1449697220.549016 [0 [::1]:62992] "PUBLISH" "celery.pidbox" "{\"body\": \"eyJyZXBseV90byI6IHsicm91dGluZ19rZXkiOiAiMTUwYWZhYzEtZThmNy0zNDI2LWEwM2ItNWRhNGYzMzg3M2JhIiwgImV4Y2hhbmdlIjogInJlcGx5LmNlbGVyeS5waWRib3gifSwgInRpY2tldCI6ICJjNGUyNTVjMS05YzZjLTQxNzktOGM4Yi05NzRmOGVjYmE5ZDQiLCAiZGVzdGluYXRpb24iOiBudWxsLCAibWV0aG9kIjogImhlbGxvIiwgImFyZ3VtZW50cyI6IHsicmV2b2tlZCI6IHt9LCAiZnJvbV9ub2RlIjogImNlbGVyeUBFc3RldmFucy1NYWNCb29rLVByby5sb2NhbCJ9fQ==\", \"headers\": {\"expires\": 1449697221.548759, \"clock\": 1}, \"content-type\": \"application/json\", \"properties\": {\"body_encoding\": \"base64\", \"delivery_info\": {\"priority\": 0, \"routing_key\": \"\", \"exchange\": \"celery.pidbox\"}, \"delivery_mode\": 2, \"delivery_tag\": \"e8e4ad76-bb0a-4c83-8cca-d01e25f3633b\"}, \"content-encoding\": \"utf-8\"}"
    
  6. The keys set in Redis for Kombu are pretty cryptic, and I'm having a hard time finding out exactly what they are used for. I assume these keys/values are the actual message queues being consumed, but what specific purpose does _kombu.binding.celery.pidbox serve, for example, and what is with the crazy formatting of the value? (I assume _kombu.binding.celery is the default task queue and _kombu.binding.celeryev is the queue for heartbeat messages.)

    1449649219.005095 [0 [::1]:57599] "SADD" "_kombu.binding.reply.celery.pidbox" "bc8319b5-c8d3-38b9-8848-da686bd088b7\x06\x16\x06\x16bc8319b5-c8d3-38b9-8848-da686bd088b7.reply.celery.pidbox"
    1449649220.020213 [0 [::1]:57604] "SADD" "_kombu.binding.celeryev" "worker.#\x06\x16\x06\x16celeryev.4834be60-b102-4fd5-9fdc-617bb945c079"
    1449649220.024899 [0 [::1]:57603] "SADD" "_kombu.binding.celery" "celery\x06\x16\x06\x16celery"
    1449649220.027983 [0 [::1]:57605] "SADD" "_kombu.binding.celery.pidbox" "\x06\x16\x06\x16celery@xxxxxx-MacBook-Pro.local.celery.pidbox"
    
  7. One (1) connection appears to just poll queues and pop elements off. This makes sense to keep alive, and I can see why it would never close. But where are these queue names coming from? They do not appear to be the same names as those set above using SADD. Why are there 3 queues?

    1449649224.677975 [0 [::1]:57601] "BRPOP" "celery" "celery\x06\x163" "celery\x06\x166" "celery\x06\x169" "1"
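
The \x06\x16 byte pair shows up in both the SADD values and the BRPOP key names above. Below is a small sketch, assuming the pair is simply a field separator (my reading of the captured values) and that a binding value is laid out as routing key, pattern, queue (also an assumption). The helper names `split_binding` and `split_queue_name` are hypothetical.

```python
SEP = "\x06\x16"  # assumed field separator, inferred from the captured values

def split_binding(member):
    """Split one _kombu.binding.* set member into (routing_key, pattern, queue)."""
    routing_key, pattern, queue = member.split(SEP)
    return routing_key, pattern, queue

def split_queue_name(name):
    """Split a BRPOP key into (base queue, suffix); suffix is None when absent."""
    base, sep, suffix = name.partition(SEP)
    return base, (suffix if sep else None)

# Values copied from the SADD lines in item 6.
print(split_binding("celery\x06\x16\x06\x16celery"))
print(split_binding(
    "worker.#\x06\x16\x06\x16celeryev.4834be60-b102-4fd5-9fdc-617bb945c079"
))

# Keys copied from the BRPOP line in item 7.
for key in ["celery", "celery\x06\x163", "celery\x06\x166", "celery\x06\x169"]:
    print(split_queue_name(key))
```

If the suffixes 3, 6 and 9 are priority levels, which is only my guess, the four BRPOP keys would be one logical queue polled at four priorities rather than separate queues.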
    

If, in the end, a single worker just needs 8 connections because all of this is necessary, then so be it.
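
For anyone who wants to inspect the PUBLISH payload from item 5: the envelope declares body_encoding as base64, so the body can be unpacked with the standard library alone. This is a minimal sketch; `decode_envelope_body` is a name I made up, and the example envelope is a shortened, hypothetical stand-in with the same field layout as the capture.

```python
import base64
import json

def decode_envelope_body(envelope_json):
    """Return the decoded body of a JSON message envelope as a Python object."""
    envelope = json.loads(envelope_json)
    body = envelope["body"]
    # Only decode when the envelope says the body is base64, as the capture does.
    if envelope.get("properties", {}).get("body_encoding") == "base64":
        charset = envelope.get("content-encoding", "utf-8")
        body = base64.b64decode(body).decode(charset)
    return json.loads(body)

# Hypothetical, shortened envelope mirroring the captured message's structure.
inner = {"method": "hello", "destination": None}
example = json.dumps({
    "body": base64.b64encode(json.dumps(inner).encode("utf-8")).decode("ascii"),
    "content-encoding": "utf-8",
    "content-type": "application/json",
    "properties": {"body_encoding": "base64"},
})
print(decode_envelope_body(example))
```

Run against the payload in item 5, the body decodes to a JSON object whose method field is hello, alongside reply_to, ticket, destination and arguments fields.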
