Ok here is an overview of what's going on:

    M <-- Message with unique id of 1234
    |
    +-Start Queue
    |
    |
    | <-- Exchange
   /|\
  / | \
 /  |  \ <-- bind to multiple queues
Q1  Q2  Q3
\   |   / <-- start of the problem is here
 \  |  / 
  \ | /
   \|/
    |
    Q4 <-- Queues 1,2 and 3 must finish first before Queue 4 can start
    |
    C <-- Consumer 

So I have an exchange that pushes to multiple queues. Each queue has a task, and only once all tasks are completed can Queue 4 start.

So a message with the unique id 1234 gets sent to the exchange, and the exchange routes it to all the task queues (Q1, Q2, Q3, etc.). When all the tasks for message id 1234 have completed, run Q4 for message id 1234.

How can I implement this?

Using Symfony2, RabbitMQBundle and RabbitMQ 3.x

Resources:

UPDATE #1

Ok I think this is what I'm looking for:

RPC with Parallel Processing. But how do I set the Correlation Id to be my unique id, so that it groups the messages and also identifies which queue each one came from?

Phill Pafford
  • So, if I'm understanding correctly, you have a 4th queue that can only start to be processed when 3 other queues are empty? If you are processing a lot of things in parallel, won't your 3 queues always be passing information? – afuzzyllama Dec 13 '12 at 14:27
  • Yes, as the queues will always have new messages. I forgot to mention that all the data in each queue is related by a unique id, so the exchange sends the unique id 1234 to Q1, Q2 and Q3. Each queue performs a different task. In Q4 I need to know when the messages with the unique id of 1234 in Q1, Q2 and Q3 are finished before I can process the message in Q4. Updated my question – Phill Pafford Dec 13 '12 at 14:32

5 Answers

You need to implement this: http://www.eaipatterns.com/Aggregator.html, but the RabbitMQBundle for Symfony doesn't support that, so you would have to use the underlying php-amqplib.

A normal consumer callback from the bundle will get an AMQPMessage. From there you can access the channel and manually publish to whatever exchange comes next in your "pipes and filters" implementation.
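
The Aggregator pattern itself does not depend on the broker library. As a minimal in-memory sketch, assuming each completion message carries the unique id and the name of the task that finished (the `CompletionAggregator` class and its `markDone` method are hypothetical names, not part of RabbitMQBundle or php-amqplib):

```php
<?php
// Hypothetical aggregator: tracks which tasks have finished for each message id.
// State lives in memory here; a real consumer would persist it (database, cache, ...).
final class CompletionAggregator
{
    /** @var string[] unique names of the tasks that must all finish */
    private $requiredTasks;

    /** @var array finished tasks, keyed by message id, then by task name */
    private $finished = [];

    public function __construct(array $requiredTasks)
    {
        $this->requiredTasks = $requiredTasks;
    }

    /**
     * Record that $task finished for $messageId.
     * Returns true when the last required task of that id reports in.
     */
    public function markDone(string $messageId, string $task): bool
    {
        if (!in_array($task, $this->requiredTasks, true)) {
            return false; // unknown task, ignore it
        }

        // Keying by task name makes a redelivered message harmless.
        $this->finished[$messageId][$task] = true;

        if (count($this->finished[$messageId]) < count($this->requiredTasks)) {
            return false;
        }

        unset($this->finished[$messageId]); // group complete, free the state
        return true;
    }
}

$aggregator = new CompletionAggregator(['Q1', 'Q2', 'Q3']);
$results = [
    $aggregator->markDone('1234', 'Q1'),
    $aggregator->markDone('1234', 'Q1'), // duplicate delivery, still incomplete
    $aggregator->markDone('1234', 'Q2'),
    $aggregator->markDone('1234', 'Q3'), // last one: this is where Q4 would be published
];
```

In a real consumer callback, a `true` return value would be the point at which the message for Q4 is published on the channel.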

old_sound

In the RPC tutorial at RabbitMQ's site, there is a way to pass around a 'Correlation id' that can identify your messages to users in the queue.

I'd recommend attaching some sort of id to the messages going into the first 3 queues, and then having another process dequeue messages from those 3 into buckets of some sort. When a bucket has received what I'm assuming is the completion of its 3 tasks, send the final message off to the 4th queue for processing.

If you are sending more than 1 work item to each queue for one user, you might have to do a little preprocessing to find out how many items a particular user placed into the queues, so the process dequeuing in front of queue 4 knows how many to expect before it publishes to queue 4.


I do my RabbitMQ work in C#, so sorry that my pseudocode isn't in PHP style:

// Client
byte[] body = new byte[size];
body[0] = uniqueUserId;
body[1] = howManyWorkItems;
body[2] = command;

// Setup your body here

Queue(body)

// Server
// Process queue 1, 2, 3
Dequeue(message)

switch(message.body[2])
{
    // process however you see fit
}

processedMessages[message.body[0]]++;

if(processedMessages[message.body[0]] == message.body[1])
{
    // Send to queue 4
    Queue(newMessage)
}

Response to Update #1

Instead of thinking of your client as a terminal, it might be useful to think of the client as a process on a server. So if you set up an RPC client on a server like this one, then all you need to do is have the server generate a unique id for the user and send the messages to the appropriate queues:

    public function call($uniqueUserId, $workItem) {
        $this->response = null;
        $this->corr_id = uniqid();

        $msg = new AMQPMessage(
            serialize(array($uniqueUserId, $workItem)),
            array('correlation_id' => $this->corr_id,
            'reply_to' => $this->callback_queue)
        );

        $this->channel->basic_publish($msg, '', 'rpc_queue');
        while(!$this->response) {
            $this->channel->wait();
        }

        // We assume that in the response we will get our id back
        return unserialize($this->response);
    }


$rpc = new Rpc();

// Get unique user information and work items here

// Pass even more information in here, like which queue to use, or loop over this to send all the work items to the queues they need.
$response = $rpc->call($uniqueUserId, $workItem);

// $response[0] is the unique user id that the server echoed back
$responseBuckets[$response[0]] = ($responseBuckets[$response[0]] ?? 0) + 1;

// Just like the code above that checks whether a bucket is full or not
afuzzyllama
  • Could you explain a little more? I understand the RPC part but you're saying add the RPC to Q1, Q2 and Q3? – Phill Pafford Dec 13 '12 at 14:39
  • Instead of using the id for an RPC, use it to group messages for the process sitting in front of your 4th queue. You probably don't even need to use that id. You could probably embed a user id into the body of your message. – afuzzyllama Dec 13 '12 at 14:43
  • What do you mean by 'Group Messages'? I think I'm getting the concept but need a little more detail – Phill Pafford Dec 13 '12 at 14:46
  • I tried to write a simple example in my answer. Of course there is a lot to be filled in, but I think it shows the route I would try to take. – afuzzyllama Dec 13 '12 at 14:58
  • I have updated my question, I think I'm looking for RPC with Parallel Processing. Would you mind looking at it again? +1 for the efforts – Phill Pafford Dec 13 '12 at 17:39

I am a little unclear on what you are trying to achieve here. But I would probably alter the design somewhat so that once all messages are cleared from the queues you publish to a separate exchange which publishes to queue 4.

robthewolf

In addition to my RPC-based answer, I want to add another one, based on the EIP aggregator pattern.

The idea is as follows: everything is async, with no RPC or other synchronous calls. Every task sends an event when it is done, and the aggregator is subscribed to that event. It simply counts finished tasks and sends the task4 message when the counter reaches the expected number (in our case 3). I chose the filesystem as storage for the counters for the sake of simplicity; you could use a database instead.

The producer is simpler. It just fires and forgets:

<?php
use Enqueue\Client\Message;
use Enqueue\Client\ProducerInterface;
use Enqueue\Util\UUID;
use Symfony\Component\DependencyInjection\ContainerInterface;

/** @var ContainerInterface $container */

/** @var ProducerInterface $producer */
$producer = $container->get('enqueue.client.producer');

$message = new Message('the task data');
$message->setCorrelationId(UUID::generate());

$producer->sendCommand('task1', clone $message);
$producer->sendCommand('task2', clone $message);
$producer->sendCommand('task3', clone $message);

The task processor has to send an event once its job is done:

<?php
use Enqueue\Client\CommandSubscriberInterface;
use Enqueue\Client\Message;
use Enqueue\Client\ProducerInterface;
use Enqueue\Psr\PsrContext;
use Enqueue\Psr\PsrMessage;
use Enqueue\Psr\PsrProcessor;

class Task1Processor implements PsrProcessor, CommandSubscriberInterface
{
    private $producer;

    public function __construct(ProducerInterface $producer)
    {
        $this->producer = $producer;
    }

    public function process(PsrMessage $message, PsrContext $context)
    {
        // do the job

        // the other task processors do the same
        $eventMessage = new Message('the event data');
        $eventMessage->setCorrelationId($message->getCorrelationId());

        $this->producer->sendEvent('task_is_done', $eventMessage);

        return self::ACK;
    }

    public static function getSubscribedCommand()
    {
        return 'task1';
    }
}

And the aggregator processor:

<?php

use Enqueue\Client\ProducerInterface;
use Enqueue\Client\TopicSubscriberInterface;
use Enqueue\Psr\PsrContext;
use Enqueue\Psr\PsrMessage;
use Enqueue\Psr\PsrProcessor;
use Symfony\Component\Filesystem\LockHandler;

class AggregatorProcessor implements PsrProcessor, TopicSubscriberInterface
{
    private $producer;
    private $rootDir;

    /**
     * @param ProducerInterface $producer
     * @param string $rootDir
     */
    public function __construct(ProducerInterface $producer, $rootDir)
    {
        $this->producer = $producer;
        $this->rootDir = $rootDir;
    }

    public function process(PsrMessage $message, PsrContext $context)
    {
        $expectedNumberOfTasks = 3;

        if (false == $cId = $message->getCorrelationId()) {
            return self::REJECT;
        }

        try {
            $lockHandler = new LockHandler($cId, $this->rootDir.'/var/tasks');
            $lockHandler->lock(true);

            $currentNumberOfProcessedTasks = 0;
            if (file_exists($this->rootDir.'/var/tasks/'.$cId)) {
                $currentNumberOfProcessedTasks = file_get_contents($this->rootDir.'/var/tasks/'.$cId);

                if ($currentNumberOfProcessedTasks +1 == $expectedNumberOfTasks) {
                    unlink($this->rootDir.'/var/tasks/'.$cId);

                    $this->producer->sendCommand('task4', 'the task data');

                    return self::ACK;
                }
            }

            file_put_contents($this->rootDir.'/var/tasks/'.$cId, ++$currentNumberOfProcessedTasks);

            return self::ACK;
        } finally {
            $lockHandler->release();
        }
    }

    public static function getSubscribedTopics()
    {
        return 'task_is_done';
    }
}
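
The read-increment-write step on the counter file is the critical section in the aggregator above. As a library-free sketch of the same idea, it could be guarded with PHP's built-in `flock`. The helper name `incrementCounter` and the temp-file path are assumptions for illustration, not part of Enqueue or Symfony:

```php
<?php
/**
 * Atomically increments the counter stored in $file and returns the new value.
 * Hypothetical helper; flock() is advisory, so every writer must go through it.
 */
function incrementCounter(string $file): int
{
    // 'c+' opens for read/write, creates the file if missing, and does not truncate.
    $handle = fopen($file, 'c+');
    if ($handle === false) {
        throw new RuntimeException("Cannot open counter file: $file");
    }

    try {
        if (!flock($handle, LOCK_EX)) { // blocks until the exclusive lock is granted
            throw new RuntimeException("Cannot lock counter file: $file");
        }

        // An empty (new) file reads as "", which casts to 0.
        $count = (int) stream_get_contents($handle) + 1;

        // Rewrite the file from the start with the new value.
        ftruncate($handle, 0);
        rewind($handle);
        fwrite($handle, (string) $count);
        fflush($handle);

        return $count;
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}

// Usage: one counter file per correlation id (this path is illustrative only).
$file = sys_get_temp_dir().'/task-counter-'.uniqid('', true);
$first = incrementCounter($file);
$second = incrementCounter($file);
$third = incrementCounter($file);
unlink($file); // the aggregator would send task4 once the count reaches 3, then clean up
```

This only coordinates processes that share one filesystem on one machine. If the consumers run on several hosts, a database or another shared store is likely the safer place for the counter.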
Maksim Kotlyar

I can show you how you can do it with enqueue-bundle.

So install it with composer and register as any other bundle. Then configure:

# app/config/config.yml

enqueue:
  transport:
    default: 'amqp://'
  client: ~

This approach is based on RPC. Here's how you do it:

<?php
use Enqueue\Client\ProducerInterface;
use Symfony\Component\DependencyInjection\ContainerInterface;

/** @var ContainerInterface $container */

/** @var ProducerInterface $producer */
$producer = $container->get('enqueue.client.producer');

$promises = new SplObjectStorage();

$promises->attach($producer->sendCommand('task1', 'the task data', true));
$promises->attach($producer->sendCommand('task2', 'the task data', true));
$promises->attach($producer->sendCommand('task3', 'the task data', true));

while (count($promises)) {
    foreach ($promises as $promise) {
        if ($replyMessage = $promise->receiveNoWait()) {
            // you may want to check the response here
            $promises->detach($promise);
        }
    }
}

$producer->sendCommand('task4', 'the task data');
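
Note that the loop above polls without a pause and never gives up if a reply is lost. A minimal sketch of the same polling idea with a deadline, written in plain PHP so it runs without a broker; `waitForReplies` and the closures standing in for promises are hypothetical, and with real promises each closure could wrap `$promise->receiveNoWait()`:

```php
<?php
/**
 * Polls every pending reply until all have arrived or the deadline passes.
 *
 * @param callable[] $pollers name => callable returning the reply, or null if not ready yet
 * @param float      $timeout maximum time to wait, in seconds
 *
 * @return array name => reply, for every poller that answered in time
 */
function waitForReplies(array $pollers, float $timeout): array
{
    $deadline = microtime(true) + $timeout;
    $replies = [];

    while ($pollers && microtime(true) < $deadline) {
        foreach ($pollers as $name => $poll) {
            $reply = $poll();
            if ($reply !== null) {
                $replies[$name] = $reply;
                unset($pollers[$name]); // safe: a by-value foreach iterates over a copy
            }
        }
        if ($pollers) {
            usleep(10000); // 10 ms pause so the loop does not spin at full CPU
        }
    }

    return $replies;
}

// Stand-ins for real promises: task1 answers on the 3rd poll, task2 never does.
$calls = 0;
$replies = waitForReplies([
    'task1' => function () use (&$calls) { return ++$calls >= 3 ? 'done' : null; },
    'task2' => function () { return null; },
], 0.2);
```

The caller would then send task4 only when the number of replies matches the number of tasks, and handle the missing ones otherwise.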

The consumer processor looks like this:

use Enqueue\Client\CommandSubscriberInterface;
use Enqueue\Consumption\Result;
use Enqueue\Psr\PsrContext;
use Enqueue\Psr\PsrMessage;
use Enqueue\Psr\PsrProcessor;

class Task1Processor implements PsrProcessor, CommandSubscriberInterface
{
    public function process(PsrMessage $message, PsrContext $context)
    {
        // do task job

        return Result::reply($context->createMessage('the reply data'));
    }

    public static function getSubscribedCommand()
    {
        // you can simply return 'task1'; if you do not need a custom queue, and you are fine to use what enqueue chooses. 

        return [
          'processorName' => 'task1',
          'queueName' => 'Q1',
          'queueNameHardcoded' => true,
          'exclusive' => true,
        ];
    }
}

Add it to your container as a service with the tag enqueue.client.processor, then run the command bin/console enqueue:consume --setup-broker -vvv

Here's the plain PHP version.

Maksim Kotlyar