Multiple queue listeners will run the same job over multiple processes

Question

I have a simple web application that is written with the Laravel 4.2 framework. I have configured the Laravel queue component to add new queue items to a locally running beanstalkd server.

Essentially, there is a POST route that will add an item to the beanstalkd tube.

I then have supervisord set up to run artisan queue:listen as three separate processes. The issue that I am seeing is that the different queue:listen processes will end up spawning anywhere from one to three queue:work processes for just one inserted job.

The end result is that one job inserted into the queue is sometimes processed by multiple workers at the same time, which is obviously something I am trying to avoid.

The job code is relatively simple:

<?php

use App\Repositories\URLRepository;

class ProcessDataJob {
    private $urls;

    public function __construct(URLRepository $urls)
    {
        $this->urls = $urls;
    }

    public function fire($job, $data)
    {
        // Validate the payload before reading it: reject it if 'post' is missing OR not an array.
        if (!isset($data['post']) || !is_array($data['post'])) {
            Log::error('[Job #'.$job->getJobId().'] $input was empty inside CreateAuditJob. Deleting job. Quitting. Bye!');
            $job->delete();
            return false;
        }

        $input = $data['post'];

        //
        // ... code that will take a few hours to run.
        //

        $job->delete();
        Log::info('[Job #'.$job->getJobId().'] ProcessDataJob was successful, deleting the job!');
        return true;
    }
}

The fun part is that most of the (duplicated) queue workers fail when deleting the job, leaving this in the error log:

exception 'Pheanstalk_Exception_ServerException' with message 'Job 3248 NOT_FOUND: does not exist or is not reserved by client'

The ttr (Time to Run) is set to 172,800 seconds (or 48 hours), which is much larger than the time it would take for the job to complete.

score 4 · Answer 1 · answered Nov 29 '14 at 06:55


What is the job's time_to_run when it is queued? If running the job takes longer than time_to_run seconds, the job is automatically re-queued and becomes eligible to be run by the next worker.

A reserved job has time_to_run seconds to be deleted, released or touched. Calling touch() restarts the timeout timer, so workers can use it to give themselves more time to finish. Or use a large enough value when queueing.
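
As a rough illustration of the touch() approach, here is a minimal sketch of a helper that sends a touch whenever a chosen interval has elapsed. The class name TtrKeepAlive, the callbacks and the interval are all assumptions made for this example; nothing here is part of Laravel or Pheanstalk.

```php
<?php
// Minimal sketch: invoke a "touch" callback whenever a chosen interval has
// passed, so a reserved job's TTR timer is restarted before it expires.
// TtrKeepAlive, its callbacks and the interval are illustrative assumptions.
class TtrKeepAlive
{
    private $touch;     // callable that issues the actual touch
    private $interval;  // seconds between touches; keep it well below the ttr
    private $clock;     // callable returning the current time in seconds
    private $lastTouch; // time of the last touch (or of construction)

    public function __construct($touch, $interval, $clock = null)
    {
        $this->touch = $touch;
        $this->interval = $interval;
        $this->clock = $clock ?: function () { return time(); };
        $this->lastTouch = call_user_func($this->clock);
    }

    // Call between units of work; returns true if a touch was sent.
    public function tick()
    {
        $now = call_user_func($this->clock);
        if ($now - $this->lastTouch < $this->interval) {
            return false;
        }
        call_user_func($this->touch);
        $this->lastTouch = $now;
        return true;
    }
}
```

Inside a long-running fire() method, tick() would be called between chunks of work. With the Laravel 4.2 beanstalkd driver, the touch callback could probably wrap something like $job->getPheanstalk()->touch($job->getPheanstalkJob()), but treat those accessor names as an assumption and check them against the installed version.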

I've found the beanstalkd protocol document helpful: https://github.com/kr/beanstalkd/blob/master/doc/protocol.md


Andras


ok, so much for the easy solution. Do the logs show that the duplicate jobs all start at the same time? Or are they staggered? It's possible there's some timeout in the app somewhere. Else you might have to dig into the dispatch code to track down what it's doing. I've written queuing systems (two in php, a third using beanstalk), but don't know laravel. – Andras Dec 01 '14 at 00:20
The duplicate jobs usually start within 10 seconds of each other. There doesn't seem to be a timeout inside the app, especially with the beanstalk connection. No exceptions or errors are being thrown. I am thinking it might be a beanstalk bug. – Tim Groeneveld Dec 01 '14 at 00:22
10 seconds is a long time... how weird. Can you instrument the dispatch? just to log the job id returned by beanstalk reserve()? Bisection search, either narrow it to beanstalk or to the app. For timeout, I was thinking of a built-in "declare error after 10 minutes" configured max-run-time type of thing – Andras Dec 01 '14 at 00:29
Ah, no timeouts have been set. A telnet to the beanstalk server running reserve on two separate connections returns the same data, so I am fairly sure it's a beanstalkd issue. – Tim Groeneveld Dec 01 '14 at 01:12
yikes what version of beanstalk? – Andras Dec 01 '14 at 01:15
beanstalkd 1.9. v1.10 is the latest, but looking at the diff (https://github.com/kr/beanstalkd/compare/v1.9...v1.10) I can't see anything that would change the behaviour that I am seeing. – Tim Groeneveld Dec 01 '14 at 01:21
I could almost reproduce the telnet issue, then I found my bug... if you disconnect from beanstalk, it resets the reserved jobs. I was piping the beanstalk command into nc localhost 11300, which disconnected after the response. Running reserve again reserved the same job. Running reserve on two different connections, and keeping them connected, reserved two different jobs. So: is the app disconnecting after reserving jobs? – Andras Dec 01 '14 at 02:23
No, it's definitely not. You are correct, you need to stay connected- disconnecting from beanstalk will put the job back into the queue for available jobs. – Tim Groeneveld Dec 01 '14 at 02:56
Andras how did you resolve this? I'm having a similar problem but using Redis as the queue server? – Matt Humphrey Apr 27 '15 at 15:07
@MattHumphrey my bug was in the testing methodology -- I was disconnecting from beanstalkd, which reset the reserved tasks into an unreserved and available state, allowing them to be reserved again. From what little I know of Redis, it publishes a notification and the workers grab the tasks, but I don't know how mutual exclusion is done. – Andras Apr 29 '15 at 13:44
Yeh it was redis that was the issue. Switched to beanstalkd and had no issues! – Matt Humphrey May 12 '15 at 15:15
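
The reservation behaviour worked out in the comments above can be illustrated with a toy in-memory model. This is only a sketch of the rules as described in this thread and in the protocol document, not a beanstalkd client: the class ToyTube and its method names are made up for the example, and priorities, delays and TTR are left out.

```php
<?php
// Toy in-memory model of beanstalkd reservations (no network, not a client).
// It mirrors two rules from the discussion: a reserved job is hidden from
// other connections, and closing a connection puts its reserved jobs back.
// ToyTube and its methods are hypothetical names used only for illustration.
class ToyTube
{
    private $ready = array();     // job ids waiting to be reserved, FIFO
    private $reserved = array();  // job id => name of the connection holding it

    public function put($jobId)
    {
        $this->ready[] = $jobId;
    }

    // Returns the next ready job id for $conn, or null if none is ready.
    public function reserve($conn)
    {
        if (empty($this->ready)) {
            return null;
        }
        $jobId = array_shift($this->ready);
        $this->reserved[$jobId] = $conn;
        return $jobId;
    }

    // A connection may delete a job it has reserved, or a job that is ready.
    // A job reserved by a different connection yields NOT_FOUND.
    public function delete($conn, $jobId)
    {
        if (isset($this->reserved[$jobId]) && $this->reserved[$jobId] === $conn) {
            unset($this->reserved[$jobId]);
            return 'DELETED';
        }
        $index = array_search($jobId, $this->ready, true);
        if ($index !== false) {
            array_splice($this->ready, $index, 1);
            return 'DELETED';
        }
        return 'NOT_FOUND';
    }

    // Closing a connection releases everything it had reserved.
    public function disconnect($conn)
    {
        foreach ($this->reserved as $jobId => $holder) {
            if ($holder === $conn) {
                unset($this->reserved[$jobId]);
                $this->ready[] = $jobId;
            }
        }
    }
}
```

In this model, two connections that stay open reserve two different jobs, while a job whose holder disconnects becomes ready again and can be reserved by someone else. If the original holder then tries to delete it, the result is NOT_FOUND, which matches the kind of error shown in the question.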

Reuel Ribeiro · Answer 2 · 2016-06-19T21:19:40.087

3

Since you are running Laravel, check your queue.php configuration file.

Change the ttr value from the default of 60 seconds to a value longer than your longest-running job.

    'beanstalkd' => array(
        'driver' => 'beanstalkd',
        'host'   => 'localhost',
        'queue'  => 'default',
        'ttr'    => 600, //Example
    ),

edited Jun 19 '16 at 21:19

answered Sep 09 '15 at 22:41

Reuel Ribeiro

