3

I have very basic producer-consumer code written with the pika framework in Python. The problem is that the consumer side processes messages from the queue too slowly. I ran some tests and found that I can speed up the workflow by up to 27 times with multiprocessing. The problem is that I don't know the right way to add multiprocessing to my code.

import pika
import json
from datetime import datetime
from functions import download_xmls


def callback(ch, method, properties, body):
    print('Got something')
    body = json.loads(body)
    type = body[-1]['Type']
    print('Object type in work currently ' + type)
    cnums = [x['cadnum'] for x in body[:-1]]
    print('Got {} cnums to work with'.format(len(cnums)))

    date_start = datetime.now()
    download_xmls(type,cnums)
    date_end = datetime.now()
    ch.basic_ack(delivery_tag=method.delivery_tag)
    print('Download complete in {} seconds'.format((date_end-date_start).total_seconds()))


def consume(queue_name = 'bot-test'):
    parameters = pika.URLParameters('server@address')
    connection = pika.BlockingConnection(parameters)
    channel = connection.channel()
    channel.queue_declare(queue=queue_name, durable=True)
    channel.basic_qos(prefetch_count=1)
    channel.basic_consume(callback, queue=queue_name)
    print(' [*] Waiting for messages. To exit press CTRL+C')
    channel.start_consuming()

How do I start with adding multiprocessing functionality from here?

  • Where do you spend most of the time? Waiting for your download? If that's the case, you are I/O bound and could probably do with threading rather than multiprocessing, or, if you fancy that, an asynchronous approach, e.g. with `asyncio`. – JohanL Mar 27 '19 at 11:45
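
If the downloads really are where the time goes, here is a minimal sketch of the threaded approach suggested in the comment above. It assumes download_xmls can be called with any subset of cnums, which may or may not hold for your implementation:

from concurrent.futures import ThreadPoolExecutor
from functions import download_xmls

def download_parallel(obj_type, cnums, max_workers=8):
    # Split cnums into chunks and download them concurrently.
    # Threads are sufficient here because the work is I/O bound (waiting on the network).
    chunk_size = max(1, len(cnums) // max_workers)
    chunks = [cnums[i:i + chunk_size] for i in range(0, len(cnums), chunk_size)]
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(download_xmls, obj_type, chunk) for chunk in chunks]
        for future in futures:
            future.result()  # re-raise any exception from a worker thread

Inside callback, download_parallel(type, cnums) would then take the place of the plain download_xmls(type, cnums) call.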

2 Answers

2

Pika has extensive example code that I recommend you check out. Note that that code is for example use only; if you do the work on threads, you will need a more intelligent way to manage your threads than the example shows.

The goal is to not block the thread that runs Pika's IO loop, and to call back into the IO loop correctly from your worker threads. That's why add_callback_threadsafe exists and is used in that code.
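
In rough outline, the pattern looks like the sketch below (assuming pika >= 1.0, and reusing the placeholder connection string and queue name from the question): the worker thread does the slow work and then hands the ack back to the connection's thread via add_callback_threadsafe.

import functools
import threading
import pika

def ack_message(ch, delivery_tag):
    # Runs on the connection's I/O loop thread, so using the channel here is safe.
    if ch.is_open:
        ch.basic_ack(delivery_tag)

def do_work(conn, ch, delivery_tag, body):
    # Slow work happens on this worker thread; the channel is never touched here directly.
    # ... parse body and call download_xmls(...) ...
    conn.add_callback_threadsafe(functools.partial(ack_message, ch, delivery_tag))

def on_message(ch, method, properties, body, args):
    conn, threads = args
    t = threading.Thread(target=do_work, args=(conn, ch, method.delivery_tag, body))
    t.start()
    threads.append(t)

connection = pika.BlockingConnection(pika.URLParameters('server@address'))
channel = connection.channel()
channel.queue_declare(queue='bot-test', durable=True)
channel.basic_qos(prefetch_count=1)
threads = []
channel.basic_consume(queue='bot-test',
                      on_message_callback=functools.partial(on_message, args=(connection, threads)))
channel.start_consuming()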


NOTE: the RabbitMQ team monitors the rabbitmq-users mailing list and only sometimes answers questions on StackOverflow.

Luke Bakken
0
import pika
import json
from multiprocessing import Process
from datetime import datetime
from functions import download_xmls
import multiprocessing
import concurrent.futures


def do_job(body):
    body = json.loads(body)
    type = body[-1]['Type']
    print('Object type in work currently ' + type)
    cnums = [x['cadnum'] for x in body[:-1]]
    print('Got {} cnums to work with'.format(len(cnums)))

    date_start = datetime.now()
    download_xmls(type, cnums)
    date_end = datetime.now()
    print('Download complete in {} seconds'.format((date_end - date_start).total_seconds()))

def callback(ch, method, properties, body):
    print('Got something')
    # Run the heavy work in a separate process; note that args must be a tuple.
    p = Process(target=do_job, args=(body,))
    p.start()
    p.join()
    # Acknowledge in the consumer process once the child has finished;
    # the channel cannot be used from the child process.
    ch.basic_ack(delivery_tag=method.delivery_tag)
    
def consume(queue_name = 'bot-test'):
    parameters = pika.URLParameters('server@address')
    connection = pika.BlockingConnection(parameters)
    channel = connection.channel()
    channel.queue_declare(queue=queue_name, durable=True)
    channel.basic_qos(prefetch_count=1)
    channel.basic_consume(callback, queue=queue_name)
    print(' [*] Waiting for messages. To exit press CTRL+C')
    channel.start_consuming()

def get_workers():
    try:
        return multiprocessing.cpu_count()
    except NotImplementedError:
        return 4

if __name__ == '__main__':
    workers = get_workers()
    # Each submitted task runs consume() in its own process, with its own connection.
    with concurrent.futures.ProcessPoolExecutor(max_workers=workers) as executor:
        for i in range(workers):
            executor.submit(consume)

The above is just a simple demo of how you can include multiprocessing here. I recommend going through the documentation to further optimise the code and achieve what you require.

https://docs.python.org/3/library/multiprocessing.html#the-process-class

Rahul Jain