
We have 60k customers whose subscriptions need to be renewed. For that, we use a renewal script that uses curl_multi_exec to create children; each child calls another script, which performs the transactions and sets is_paid=1 in the database.

The script runs from a cron job on a CentOS 6 server with PHP 5.3 and Apache. It always takes between 30 and 50 minutes to finish. The script below creates children that run another script; that script calls an API using curl, and the API calls a payment gateway using curl and updates the database.

If $number_of_items = 400 or something like that, all the curl requests complete, but CPU usage gets so high that MySQL stops inserting records in the database. So the customer pays but is_paid stays 0 because MySQL never updates it, and around 2% of the users don't get access to the content they paid for.

If $number_of_items = 6000, there are fewer curl threads and more items per thread. CPU usage is fine, but not everyone gets their subscription renewed. With 60k customers, 10 curl threads are created successfully, but maybe one finishes only half its job, and we end up with 8k customers renewed or something like that; the rest never even call the payment API.
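To make the trade-off between the two settings concrete, here is a small sketch of the arithmetic. The helper name `count_batches` is made up for illustration; the script itself gets the same split from `array_slice` in its `while` loop.

```php
<?php
// Hypothetical helper: number of child requests a given batch size produces.
// Every batch becomes one curl handle, and all handles run at the same time.
function count_batches($total_orders, $number_of_items)
{
    return (int) ceil($total_orders / $number_of_items);
}

$total_orders = 60000; // customer count from the question

foreach (array(400, 6000) as $number_of_items) {
    printf(
        "%d items per batch -> %d concurrent requests\n",
        $number_of_items,
        count_batches($total_orders, $number_of_items)
    );
}
// prints:
// 400 items per batch -> 150 concurrent requests
// 6000 items per batch -> 10 concurrent requests
```

So the small batch size means 150 Apache children working at once, while the large one means only 10 children that each have to survive 6000 sequential payment calls.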

<?php
ini_set('max_execution_time', 0);
ini_set('memory_limit', '-1');
ini_set('display_errors', 'Off');
ini_set("default_socket_timeout", -1);
$filename = __DIR__ . "/subscriptionRenewalAPI.lock";
$lifelimit = 86400; // Maximum lock lifetime in seconds; an older lock is treated as stale
/* Check the age of the lock file if it exists */
if (file_exists($filename)) {
    $lifetime = time() - filemtime($filename);
} else {
    $lifetime = 0;
}

/* Proceed only if the lock file does not exist or is too old */
if (!file_exists($filename) || $lifetime > $lifelimit) {
    if ($lifetime > $lifelimit) {
        unlink($filename); // Delete the lock file if it exists and is too old
    }

    $file = fopen($filename, "w+"); // Create lockfile

    if ($file === false) {
        die("Could not create the lock file, check permissions");
    }

    // Fetch orders that expire and need renewal TODAY (getOrdersForRenewal)
    // create a new cURL resource
    $ch = array();
    $ch2 = curl_init();
    $mh = curl_multi_init();


    // set URL and other appropriate options
    curl_setopt($ch2, CURLOPT_URL, "http://localhost/crudAPI_mobile/getOrdersForRenewal?auth=staticAuthApli");
    curl_setopt($ch2, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch2, CURLOPT_HEADER, 0);
    if ($response = curl_exec($ch2)) {

        if ($response != -1) {
            $orders_for_renewal = json_decode($response, true);


            $begining = 0;
            $number_of_items = 400;

            $curl_counter = 0;

            while ($begining < sizeof($orders_for_renewal)) {  //For each batch -> open a new curl

                $output = array_slice($orders_for_renewal, $begining, $number_of_items);


                //Open a new curl channel
                $ch[$curl_counter] = curl_init();

                //Time to iterate and renew
                $data = array('output' => $output);

                $data = http_build_query($data);
                curl_setopt($ch[$curl_counter], CURLOPT_URL, "http://localhost/crons/subscriptionRenewalAPIExecution.php");
                curl_setopt($ch[$curl_counter], CURLOPT_POST, true);
                curl_setopt($ch[$curl_counter], CURLOPT_POSTFIELDS, $data);
                curl_setopt($ch[$curl_counter], CURLOPT_HEADER, 0);
                curl_setopt($ch[$curl_counter], CURLOPT_RETURNTRANSFER, true);
                curl_multi_add_handle($mh, $ch[$curl_counter]); //Add it to the multi curl handler

                $begining += $number_of_items;
                $curl_counter++;
            } //end of while batch

            $renewed_orders = 0;
            $churned_orders = 0;

            //Execute the concurrent curl calls prepared above
            $active = null;
            $do_while_counter = 0;
            do {

                $mrc = curl_multi_exec($mh, $active);
                curl_multi_select($mh);
                $do_while_counter++;
                sleep(1); // Maybe needed to limit CPU load (See P.S.)
            } while ($active);
            $content = array();
            $i = 0;
            $number_of_curl_requests = 0;
            foreach ($ch as $i => $c) {
                $content[$i] = curl_multi_getcontent($c);
                $content[$i] = unserialize($content[$i]);
                $number_of_curl_requests++;

                curl_multi_remove_handle($mh, $c);
            }
        } 

    }
    // close cURL resource, and free up system resources
    curl_multi_close($mh);
    curl_close($ch2);
    unlink($filename); // Remove the lock file once processing is done
} else {
    exit(); // Process already in progress
}

I want to keep the batches relatively big so that the CPU doesn't go crazy, and I want to make sure that no child times out.
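One way to get both properties is to decouple batch size from concurrency: keep many small batches, but allow only a few of them in flight at once, and record which children failed instead of assuming they all finished. The following is only a sketch under assumptions, not a tested fix for this setup. The function name `run_bounded` and its parameters `$max_parallel` and `$timeout_seconds` are invented for illustration; the curl calls themselves are standard PHP cURL functions.

```php
<?php
/**
 * Run requests with at most $max_parallel transfers in flight (hypothetical helper).
 * $requests maps a batch key to an array of cURL options for that batch.
 * Returns array(key => array('ok' => bool, 'error' => string, 'body' => string)).
 */
function run_bounded(array $requests, $max_parallel, $timeout_seconds)
{
    $mh      = curl_multi_init();
    $handles = array(); // batch key => handle currently in flight
    $results = array();

    while ($requests || $handles) {
        // Top up the window with queued requests.
        while ($requests && count($handles) < $max_parallel) {
            reset($requests);
            $key     = key($requests);
            $options = $requests[$key];
            unset($requests[$key]);

            $ch = curl_init();
            curl_setopt_array($ch, $options);
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_TIMEOUT, $timeout_seconds); // upper bound per child
            curl_multi_add_handle($mh, $ch);
            $handles[$key] = $ch;
        }

        // Drive all transfers that are currently in flight.
        do {
            $status = curl_multi_exec($mh, $active);
        } while ($status === CURLM_CALL_MULTI_PERFORM);

        // Collect every transfer that has finished, successfully or not.
        while ($info = curl_multi_info_read($mh)) {
            $ch  = $info['handle'];
            $key = array_search($ch, $handles, true);
            $results[$key] = array(
                'ok'    => $info['result'] === CURLE_OK,
                'error' => curl_error($ch),
                'body'  => (string) curl_multi_getcontent($ch),
            );
            curl_multi_remove_handle($mh, $ch);
            unset($handles[$key]);
        }

        // Wait for socket activity instead of sleeping a fixed second.
        if ($handles && curl_multi_select($mh, 1.0) === -1) {
            usleep(100000); // select can fail immediately; avoid a busy loop
        }
    }

    curl_multi_close($mh);
    return $results;
}
```

With the script above, `$requests` would hold one entry per `array_slice` batch, carrying the same `CURLOPT_URL`, `CURLOPT_POST` and `CURLOPT_POSTFIELDS` values. Any batch whose `ok` flag is false, for example because it hit the timeout, can then be logged for follow-up rather than silently lost.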

Lynob
  • Have you checked what contributes to the high CPU usage? Perhaps you should look into reusing [cURL handles](https://stackoverflow.com/questions/3787002/reusing-the-same-curl-handle-big-performance-increase)? – Daniel Protopopov Apr 22 '20 at 18:59
  • @DanielProtopopov, how do I correctly find what contributes to the high CPU usage? We were debating between some bad SQL queries and the high number of curl threads. Personally I vote for the latter, because when the CPU gets high, I restart Apache and everything works fine; we're using Apache prefork. If it were a MySQL problem, I'd have to restart MySQL to lower the CPU. Also, it's the update query that runs after the payment gateway is called that is failing, so all the select queries before it are doing fine. – Lynob Apr 22 '20 at 19:07
  • @DanielProtopopov I haven't looked into reusing handles. That question is about performance; I'm not sure whether it lowers CPU usage too. Would it? – Lynob Apr 22 '20 at 19:08
  • What about re-writing your logic to split the work up over time, i.e. 4 cron jobs staggered over the day that each work on 15,000 of the accounts? – Dave S Apr 22 '20 at 20:14
  • @Lynob that is a question you should direct to your system administrator, who, through analysis of the system's performance, will be able to tell you the most probable cause. Otherwise you are better off following Dave's advice and splitting your processing over a matter of hours, instead of trying to squeeze the most out of your server all at once and then trying to optimize it. That may indeed be the most efficient way ;) – Daniel Protopopov Apr 22 '20 at 20:26
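The staggering idea from the comments could look roughly like the sketch below. It assumes that each order returned by getOrdersForRenewal carries a numeric `id` field, which the question does not show, and the function name `orders_for_slot` is invented here. Partitioning by ID rather than by position keeps the slots stable even if the order list shrinks between runs.

```php
<?php
/**
 * Select the orders that belong to one of $slot_count staggered cron runs.
 * ASSUMPTION: every order is an array with a numeric 'id' key.
 */
function orders_for_slot(array $orders, $slot, $slot_count)
{
    $slot     = (int) $slot; // may arrive as a string, e.g. from $argv
    $selected = array();

    foreach ($orders as $order) {
        // An order always lands in the same slot, whatever else is in the list.
        if ((int) $order['id'] % $slot_count === $slot) {
            $selected[] = $order;
        }
    }

    return $selected;
}
```

Four cron entries spread over the day would each pass a different slot number from 0 to 3, so each run handles about a quarter of the 60k accounts.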
