3

I deployed celery for some tasks that need to be performed at my workplace. These tasks are huge and I bought a few high-spec machines for performing these. Before I detail my issue, let me brief about what all I've deployed:

  • RabbitMQ broker on a remote server
  • Producer that pushes tasks on another remote server
  • Workers at 3 machines deployed at my workplace

Now, when I started the whole process was as smooth as I tested and everything process just great!

The problem

Unfortunately, I forgot to consult my network guy about a fixed IP address, and as per our location, we do not have a fixed IP address from our ISP. So my celery workers upon network disconnect freeze and do nothing. Even when the network is running, because the IP Address changed, and the connection to the broker is not being recreated or worker is not retrying connection. I have tried configuration like BROKER_CONNECTION_MAX_RETRIES = 0 and BROKER_HEARTBEAT = 10. But I had no option but to post it out here and look for experts on this matter!

PS: I cannot restart the workers manually everytime the network changes the IP address by kill -9

Sarthak Sawhney
  • 432
  • 4
  • 14

2 Answers2

0

Restarting the app using:

sudo rabbitmqctl stop_app
sudo rabbitmqctl start_app

solved the issue for me. Also, since I had virtual host setup, I needed to get that reset too. Not sure why was that needed. Or in fact any of the above was needed, but it did solve the problem for me.

qre0ct
  • 5,680
  • 10
  • 50
  • 86
0

The issue was because I was unable to understand the nature of AMQP protocol or RabbitMQ.

When a celery worker starts it opens up a channel at RabbitMQ. This channel upon any network changes tries to reconnect, but the port/sock opened for the channel previously is registered with a different public IP address of the client. As such the negotiations between the celery worker (client) and RabbitMQ (server) cannot resume because the client has changed the address, hence a new channel needs to be established in case of a change in the public IP address of the client.

The answer by @qreOct above is due to either I was unable to express the question properly or because of the difference in our perceptions. Still thanks a lot for taking your time out!

Sarthak Sawhney
  • 432
  • 4
  • 14