I'm on a shared web hosting account where max_execution_time defaults to 120. I call set_time_limit(0), but the value shown in phpinfo() does not change. Is there a way to prevent the gateway timeout error, for example by stopping the process just before the 504 Gateway Timeout appears? I'm fetching favicons from 50 to 100 URLs. Here is my code.

<?php 
set_time_limit(0);
error_reporting(E_ERROR | E_PARSE);
include_once('../simplehtmldom_1_9_1/simple_html_dom.php');
function get_domain($url)
    {
      $pieces = parse_url($url);
      $domain = isset($pieces['host']) ? $pieces['host'] : '';
      if (preg_match('/(?P<domain>[a-z0-9][a-z0-9\-]{1,63}\.[a-z\.]{2,6})$/i', $domain, $regs)) {
        return $regs['domain'];
      }
      return false;
    }
function getfavicon($filename) {

      libxml_use_internal_errors(false);
      header('Content-type: text/html; charset=utf-8');
        $file = file_get_contents($filename);       
        $dom = new DOMDocument;
        $fav = array();
        @$dom->loadHTML($file);
        foreach($dom->getElementsByTagName('link') as $lnk) 
        {
            if(($lnk->getAttribute("rel") == "icon")||($lnk->getAttribute("rel") == "shortcut icon"))
              {
                $fav[] = $lnk->getAttribute("href");
              }
        }
        return $fav;
    }
$servername = "localhost:3306";
$username = "topswis7_user";
$password = "J75vsHs8p6";
$database = "topswis7_scrape";
$conn = new mysqli($servername, $username, $password, $database);
   $site = $_POST['favurl'];
   $ids = explode("\n", str_replace("\r", "", $site));
   $chunk = array_chunk($ids, 100);
   $count = count($chunk);
   $ficon = array();
   foreach($ids as $key=>$value){
    $result = parse_url($value);
    $url = $result['scheme']."://".$result['host'];
        // $url = 'http://example.com/';
         $ch  = curl_init($url);
                    // curl_setopt($ch, CURLOPT_URL, $url);
                    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
                    curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
                    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, FALSE);
                    curl_setopt($ch, CURLOPT_HEADER, true);
                    curl_setopt($ch, CURLOPT_COOKIEJAR, '-');
                    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
                    curl_setopt($ch, CURLOPT_TIMEOUT, 10);
                    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
                    curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1");
        $content  = curl_exec($ch);
        curl_close($ch);
                // $datenbank = "proxy_work.php"; 
                // $datei = fopen($datenbank,"w+");
                // fwrite($datei, $content);  
                // fwrite ($datei,"\r\n");
                // fclose($datei);
        // echo $content;
        $html = getfavicon($url);
        $http = 'http';
        // var_dump($html);
        foreach($html as $key => $value){
          // pathinfo() omits 'extension' when the link has none, so default to ''.
          $favinfo = pathinfo($value);
          $extension = isset($favinfo['extension']) ? $favinfo['extension'] : '';
          // Only .ico, .png and .jpg icons are stored.
          if (in_array($extension, array('ico', 'png', 'jpg'), true)) {
            // A link that does not contain "http" is treated as relative to the site root.
            if (strpos($value, $http) === false) {
              $value = $url.$value;
            }
            // Prepared statement, so quotes in the link cannot break the query.
            $stmt = $conn->prepare("INSERT INTO tbl_favicon(faviconLink) VALUES (?)");
            if ($stmt) {
              $stmt->bind_param('s', $value);
              $stmt->execute();
              $stmt->close();
            }
            // Store only the first matching icon per site.
            break;
          }
        }
    }

                  $wordquery = "SELECT * FROM tbl_favicon";
                  $result = $conn->query($wordquery);
                  while($row = $result->fetch_assoc()) {
                    echo $row['faviconLink']."<br />";
                  }
  echo '<br /><a href="../index.php">Back Home</a><br />
            </body></html>';
  // var_dump($ficon);
?>
  • I insert the favicons into the DB to store the ones I scrape or fetch from websites. – Marc Justin Rait Jan 29 '20 at 10:14
  • Yes, I'm getting the favicons of multiple websites. – Marc Justin Rait Jan 29 '20 at 10:15
  • First of all, such a gateway timeout is something different from your actual PHP max execution time, see https://webmasters.stackexchange.com/a/119437 – 04FS Jan 29 '20 at 10:15
  • _“Like stop the process right away before the 504 gateway error popups.”_ - if you know the timeout value, then you could try and measure the time your script has already run, and then stop what you are doing when it gets “close” to that. – 04FS Jan 29 '20 at 10:16
  • Yes, the timeout value is supposed to be 120 seconds, but when I measure the script's running time it only reaches 60 seconds. – Marc Justin Rait Jan 29 '20 at 10:31
  • I tried adding ob_flush() and sleep(), but nothing changed. – Marc Justin Rait Jan 29 '20 at 10:32
  • I expect the script to run for at least 120 seconds, or until it finishes processing all the URLs. – Marc Justin Rait Jan 29 '20 at 10:41
  • Again: PHP max execution time and gateway timeout are different things. The gateway timeout comes from the webserver or a proxy, waiting for a response from your PHP script. You can configure your PHP script to run for all eternity, that doesn’t matter if the gateway is only willing to wait for time span X. – 04FS Jan 29 '20 at 11:05
  • _“that doesn’t matter if the gateway is only willing to wait for time span X.”_ - I'm confused right now. – Marc Justin Rait Jan 29 '20 at 11:15
  • Did you read the explanation I referred to above? – 04FS Jan 29 '20 at 11:21
  • Yes. If they are different, how do I prevent the 504 gateway error from appearing? Could I limit the script's runtime to 60 seconds and stop the process once it reaches that? – Marc Justin Rait Jan 29 '20 at 11:30
  • Yes, if you want to avoid the gateway timeout, then the gateway needs to get a proper response from the script before that time span has passed. And that means the script needs to end in a regular fashion; it must not get killed off due to its own timeouts. – 04FS Jan 29 '20 at 11:53
  • Any suggestion on how I can make the gateway get a response from the script? – Marc Justin Rait Jan 29 '20 at 12:29
  • Other than making sure your script ends properly, there shouldn’t be much else you need to do in that regard. – 04FS Jan 29 '20 at 12:29
  • I don't have any idea how to do it. Can you suggest example code? Thank you in advance. – Marc Justin Rait Jan 29 '20 at 12:35
  • Get the timestamp when your script starts, check the difference to the current timestamp inside your outer foreach loop. If that difference gets greater than whatever you figured would be a sensible limit - break out of the loop. (Test with smaller limits first maybe, then try and work your way up to larger ones when the basic implementation works as expected.) You will probably still have to figure out how to pick up where you left on the following runs though … – 04FS Jan 29 '20 at 12:45
  • I did what you instructed: I started a timer at the beginning of the process and set a limit of 250 seconds to stop the whole process, but it ignores the 250-second limit and still continues. – Marc Justin Rait Feb 03 '20 at 06:23
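
The timestamp check suggested in the comments could look roughly like the sketch below. It is only an illustration under stated assumptions: process_with_budget() and its parameters are hypothetical names that do not appear in the question, and the budget has to be chosen below the gateway's timeout, which the thread never establishes (the 60 seconds observed above is a guess at best). The function returns the items it did not get to, so a later request can pick up where this one stopped.

```php
<?php
// Hypothetical helper (not part of the question's code): run $handler on each
// item until $budgetSeconds of wall-clock time have elapsed, then stop cleanly.
function process_with_budget(array $items, callable $handler, float $budgetSeconds): array
{
    $start = microtime(true); // wall-clock start time in seconds
    $done  = 0;               // number of items fully processed

    foreach ($items as $item) {
        // Check the elapsed time before starting the next item.
        if (microtime(true) - $start >= $budgetSeconds) {
            break;
        }
        $handler($item);
        $done++;
    }

    // Whatever was not processed is returned for a later run.
    return array_slice($items, $done);
}
```

A call such as `$remaining = process_with_budget($ids, 'fetch_one_favicon', 45.0);` (where fetch_one_favicon is again a hypothetical name for the per-URL work) would then end the request normally and leave `$remaining` to be stored or re-submitted. Note that the check only runs between items, so a single slow request can still overshoot the budget. The cURL call in the question has CURLOPT_TIMEOUT set, but file_get_contents() in getfavicon() has no explicit timeout and falls back to PHP's default_socket_timeout setting, which may be one reason a limit appears to be ignored.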

0 Answers