3

laravel 5.4
php 7.1.32
supervisor 3.3.1
(I know... I know. The company is running 3 years behind)

config/queue.php

'database' => [
    'driver' => 'database',
    'connection' => 'queue', // means in config/database.php I have to setup a new connection
    'table' => 'jobs',
    'queue' => 'default',
    'retry_after' => 90,
],

supervisor configurations

[program:laravel-worker]
process_name=%(program_name)s_%(process_num)02d
command=php /var/www/html/project/artisan queue:work
autostart=true
autorestart=true
user=root
numprocs=1
redirect_stderr=true
stdout_logfile=/var/www/html/project/storage/logs/supervisord.log


SCENARIO: When a queue is dispatched, a new row gets added to the jobs table. Then a Laravel Worker is assigned to a job, that's when the reserved_at & attempts columns are updated. Once the job is finished, the row (job) gets deleted.

Now, for some mystical reasons, sometimes a job works seamlessly & sometime a job is there but no worker is assigned to it and within a split second that job is deleted. What just happened? Also, this is so intermittent that it's hard to track which jobs were actually attempted and later removed or just removed by this issue.

I've been scratching my head for at least a week now, please guide me.


Edit
- php artisan queue:restart broadcast, the supervisor updates with the up-time but no improvement on the issue.
- Later on, I removed the supervisor and tried doing the manual way php artisan queue:work and even that removed the job instead of attempting it (No new data inserted in failed_jobs either).


Edit (2) - Logs
Here's the function that dispatches the job

public static function sendVoiceCall($instituteSmsSettings, $announcementId, $postUrl, $mobileNumbers)
    {
        Log::debug('Job dispatched');
        dispatch(new SendVoiceCall($instituteSmsSettings, $announcementId, $postUrl, $mobileNumbers));
    }

Here's the Job

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Queue\SerializesModels;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Support\Facades\Log;

class SendVoiceCall implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public $tries = 3;
    protected $ARR_POST_DATA = array();
    protected $postUrl = '';

    public function __construct($instituteSmsSettings, $announcementId, $postUrl, $mobileNumbers)
    {
        $this->ARR_POST_DATA['username'] = $instituteSmsSettings->username;
        $this->ARR_POST_DATA['token'] = $instituteSmsSettings->token;
        $this->ARR_POST_DATA['announcement_id'] = $announcementId;
        $this->ARR_POST_DATA['plan_id'] = $instituteSmsSettings->plan_id;
        $this->ARR_POST_DATA['caller_id'] = $instituteSmsSettings->caller_id;
        $this->ARR_POST_DATA['contact_numbers'] = $mobileNumbers;
        $this->postUrl = $postUrl;
    }

    public function handle()
    {
        Log::debug('Job started');
        $curl = curl_init();
        curl_setopt($curl, CURLOPT_URL, $this->postUrl);
        curl_setopt($curl, CURLOPT_POST, 1);
        curl_setopt($curl, CURLOPT_POSTFIELDS, http_build_query($this->ARR_POST_DATA));
        curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);

        $sendCallResponse = curl_exec($curl);
        $curlResponse = json_decode($sendCallResponse, true);
        curl_close($curl);

        Log::info('Queue complete. Announcement Id: ' . ($this->ARR_POST_DATA['announcement_id'] ?? ''));
    }
}

Here's the Log

[2020-04-13 10:17:49] production.DEBUG: Job dispatched  
[2020-04-13 10:17:50] production.DEBUG: Job started  
[2020-04-13 10:17:50] production.INFO: Queue complete. Announcement Id: 203691 

As you can see, the queue started and finished at the same time (there should have been a 2-3 sec difference. The worker is deleting the job rather than attempting it.

Digvijay
  • 7,836
  • 3
  • 32
  • 53
  • 2
    Do you have a failed_jobs table? If yes, does it get populated with these missing jobs? Is there any case that the code itself produces an error that is somehow swallowed, for example by a try-catch block, where catch does nothing? – George Koniaris Apr 09 '20 at 16:54
  • @GeorgeKoniaris The failed_jobs table is empty, even when the missing jobs gets deleted. I have try catch block added to my code but no error is thrown. I have also added logs for debugging purpose but the code works fine since the dispatch() code adds to the jobs table. No error thrown in the supervisor logs as well. – Digvijay Apr 10 '20 at 04:41
  • Could you try to disable `autorestart` and perform it again? I'm quite curious why there is no logging in your `failed_jobs` table nor in the supervisor logfile. My first guess segfault issues. Happens to me very rarely but sadly it happens from time to time. Especially since you're using an old L5 version with PHP7+. Are you able to reproduce it locally with the exact PHP version? – Christoph Kluge Apr 10 '20 at 09:22
  • 1
    You should add logs to your Job class to see what happens. – Olivier Apr 10 '20 at 10:08
  • and the job code and logic would be nice – mrhn Apr 10 '20 at 17:28
  • 1
    Just to re-iterate the suggestion already made, as you haven't definitively ruled it out yet here - the simplest explanation is that the job actually does run, does not produce any output or any error, and completes normally. Are you 100% sure that this is not happening? Add logging as the very first thing the code does so you can see if it even starts. One thing I notice is you have `user=root`, that would usually be the user your web server runs as, is root really correct? Maybe some output file or logs getting created as root in some cases, causing write problems later? – Don't Panic Apr 11 '20 at 03:33
  • You should add job notifications and check what they report on peculiar jobs (or logging inside the job for that matter) you’re having trouble with, as you haven’t shown us any examples of logs where such behavior can be seen and tracked. Without additional investigation via logging it is difficult to guess what is at fault. – Daniel Protopopov Apr 13 '20 at 20:25
  • After Oliver's, Don't Panic's and @DanielProtopopov comments, I have added Edit (2) Logs for better understanding of the issue. Thanks. – Digvijay Apr 14 '20 at 04:19
  • Have you checked and ensured that there is only one `queue:listen` doing the processing? There are a number of posts like [this](https://stackoverflow.com/questions/26834623/why-is-laravel-or-beanstalkd-skipping-jobs/26973105) and [this](https://stackoverflow.com/questions/26834623/why-is-laravel-or-beanstalkd-skipping-jobs) that suggest why this may be happening. It seems you utilize different environments, so these may very well apply to the case. – Daniel Protopopov Apr 14 '20 at 04:59

1 Answers1

3

Your job is making a curl request, but does no error handling or checking of that call. The request could be failing for some reason, but this job will never tell you. Particularly now, when there is an unexplained problem, it makes sense to do some basic error checking of the request and the response.

You can check several things. Probably no need to do all these once you have solved this particular problem, but you should probably use at least one of them going forward to keep tabs on this call.

  • Use curl_error() to check if the call worked;

  • Use curl_getinfo() to show you details of the network request and response. This will give you lots of info, including http_response, you could test that it was 200 or whatever you expect;

  • If the remote server responds with some message - presumably it is since you are json_decode()ing it - log it. If there is some kind of success/failure element to the response, test it, make sure it is what you expect;

For example (as mentioned you would not typically do all of these, just choose which suits your situation):

$curl = curl_init();
// ... your code

// Get request info as an array
$info = curl_getinfo($curl);

$sendCallResponse = curl_exec($curl);
curl_close($curl);

// Log the actual response from the remote server
Log:info('Curl response: ' . $sendCallResponse);

// Log extensive info about the request/response, good when troubleshooting
Log:info('Curl info: ' . print_r($info, true));

// Maybe check the http response code
if ($info['http_code'] !== 200) {
    // handle an error, send a notification, etc
    Log:info('Whoah, remote server responded with ' . $info['http_code']);
}

// Maybe check if curl returned an error
if ($sendCallResponse === false) {
    // handle an error, send a notification, etc
    Log:info('Oh noes, curl error! ' . curl_error($curl));
}

// Maybe test the actual response, if there is something testable in it, 
// say a "status":
if ($curlResponse['status'] !== 'OK') {
    // handle an error, send a notification, etc
    Log:info('Remote returned status ' . $curlResponse['status']);
}
Don't Panic
  • 13,965
  • 5
  • 32
  • 51
  • You are right. There was a curl error in the response. Thank @Don't Panic for guiding me and helping me understand basic error checking of the request and the response. The bounty is yours. Thank you. – Digvijay Apr 14 '20 at 11:45