1

I'm using PHPExcel to create excel files using php. First take look at below code, the problem with this code is, After saving some amount of data into excel it only remains in process but doesn't saves anything. I think the script saves any data to catch and temporary values and this is causing the script to take more and more loading after each iteration. Explanation:

First of all this code retrieves some integer values from one excel file one by one viz. sample.xls (this files contains values in only column A.). suppose it has retrieved first value from cell A1 is 1212, then the code sets $target=1212, after the curl functions retrieves data for 1212 and saves as html in results folder as 1212.html. After the dom library starts their work. the 1212.html file contains table with three columns and so many rows. so dom grabs data for td and tr and saves respective values in excel cell and finally it saves data into excelresult folder as 1212.xlsx, and again the same process taken place for cell A2 in sample.xls, retrives some values like 1213, and starts grabbing and so on.

Problem:

Here it takes little time for first value like 1212, then takes little more time for second value 1213, and little more and after four or five values it takes so long (lots of minutes) time for execution, Please help me decrease this time, and make this process faster. Thanks.

code:

<?php
......
ini_set('include_path', ini_get('include_path').';../Classes/');
include_once 'PHPExcel.php';
include_once 'Excel2007.php';

$objPHPExcel = new PHPExcel();

$objPHPExcel->getProperties()->....//set some properties//
$excel->read('sample.xls'); // added excel reader from which we need to take some values   
        $x=1;
        while($x<=$excel->sheets[0]['numRows']) { // reading row by row 
          $y=1;
          while($y<=$excel->sheets[0]['numCols']) {// reading column by column 
            $cell = isset($excel->sheets[0]['cells'][$x][$y]) ? $excel->sheets[0]['cells'][$x][$y] : '';
            $target = $cell;


//  $objWorksheet = $objPHPExcel->getActiveSheet();
            //  $highestRow = $objWorksheet->getHighestRow();
            //  for($row=1; $row < $highestRow; ++$row){
   // $objPHPExcel->getActiveSheet()->removeRow($row,$row);
         // }
/* some lines of code using curl to fetch data for $target value
........... */
//below is the code which retrives data from html table and saves into excel file.
$url='results/'.$target.'.html';
include_once('dom.php');

$html=file_get_html($url);

    $record_find='first';

    foreach($html->find('table#GridView1') as $e){

                 if($record_find=='first')

                 $i=1;
                 $j=0;

                 foreach($e->find('tr') as $e1){

                                 $distno=trim($e1->find('td', 0)->innertext);
                                 $acno=trim($e1->find('td', 1)->innertext);
                                 $partno=trim($e1->find('td', 2)->innertext);
                 $objPHPExcel->setActiveSheetIndex(0);
                                $objPHPExcel->getActiveSheet()->SetCellValue('A'.$j, $distno);
                                $objPHPExcel->getActiveSheet()->SetCellValue('B'.$j, $acno);
                                $objPHPExcel->getActiveSheet()->SetCellValue('C'.$j, $partno);

                                 $j++;
                 }
    }

$objPHPExcel->getActiveSheet()->setTitle($target);

$objWriter = new PHPExcel_Writer_Excel2007($objPHPExcel); 
$objWriter->save('excelresult/'.$target.'.xlsx');

$y++;
          }
          $x++;
        } 
?>

Curl:

$debug = 1;
$url = "url";
$f = fopen('log.txt', 'w');
$cookies = 'cookies.txt';
touch($cookies);
$useragent = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/36.0.1985.125 Chrome/36.0.1985.125 Safari/537.36';


$ch = curl_init($url);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookies);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);

$html = curl_exec($ch);

curl_close($ch);

preg_match('~<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="(.*?)" />~', $html, $viewstate);
preg_match('~<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="(.*?)" />~', $html, $eventValidation);

$viewstate = $viewstate[1];
$eventValidation = $eventValidation[1];

$ch = curl_init();

curl_setopt($ch, CURLOPT_URL, $url);
//curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookies);
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookies);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_STDERR, $f);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 0);
curl_setopt($ch, CURLOPT_TIMEOUT, 985000);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);

// Collecting all POST fields
$postfields = array();
$postfields['__EVENTTARGET'] = "";
$postfields['__EVENTARGUMENT'] = "";
$postfields['__LASTFOCUS'] = "";
$postfields['__VIEWSTATE'] = $viewstate;
$postfields['__EVENTVALIDATION'] = $eventValidation;
$postfields['cns_fer'] = 2;
$postfields['xttPd'] = $target;
$postfields['tsfDes'] = "Search";

curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postfields);
$ret = curl_exec($ch);
curl_close($ch);
file_put_contents('results/'.$target.'.html', $ret);

1 Answers1

0

Something like:

for($row=1; $row < $highestRow; ++$row){
    $objPHPExcel->getActiveSheet()->removeRow($row,$row);
}

Is going to take a lot of time with each iteration, and most of it redundant as well....

If you need to remove all data from the existing worksheet then use

    $objPHPExcel->getActiveSheet()->removeRow(1,$highestRow);

instead, no need to loop (especially spurious looping removing what has already been removed many times over), remove everything in a single call

I'm also puzzled why you're using one library for reading files (excel Reader) and another for writing (PHPExcel) when you could use a single library (PHPExcel) for both purposes.... although you seem to be doing very little with excel Reader because you're iterating over every cell in that spreadsheet and don't then seem to do anything with it at all

EDIT

What I meant by that last comment was something like:

<?php
......
ini_set('include_path', ini_get('include_path').';../Classes/');
include_once 'PHPExcel.php';
include_once 'Excel2007.php';

$excel->read('sample.xls'); // added excel reader from which we need to take some values   

$x=1;
while($x<=$excel->sheets[0]['numRows']) { // reading row by row 
    $y=1;
    while($y<=$excel->sheets[0]['numCols']) {// reading column by column 
        $cell = isset($excel->sheets[0]['cells'][$x][$y]) ? $excel->sheets[0]['cells'][$x][$y] : '';
        $target = $cell;

        /* some lines of code using curl to fetch data for $target value
            ........... */
        //below is the code which retrives data from html table and saves into excel file.
        $url='results/'.$target.'.html';
        include_once('dom.php');

        $html=file_get_html($url);

        $objPHPExcel = new PHPExcel();
        $objPHPExcel->getProperties()->....//set some properties//

        $record_find='first';

        foreach($html->find('table#GridView1') as $e){
            if($record_find=='first')
                $i=1;
            $j=0;

            foreach($e->find('tr') as $e1){
                $distno=trim($e1->find('td', 0)->innertext);
                $acno=trim($e1->find('td', 1)->innertext);
                $partno=trim($e1->find('td', 2)->innertext);
                $objPHPExcel->setActiveSheetIndex(0);
                $objPHPExcel->getActiveSheet()->SetCellValue('A'.$j, $distno);
                $objPHPExcel->getActiveSheet()->SetCellValue('B'.$j, $acno);
                $objPHPExcel->getActiveSheet()->SetCellValue('C'.$j, $partno);

                $j++;
            }
        }

        $objPHPExcel->getActiveSheet()->setTitle($target);

        $objWriter = new PHPExcel_Writer_Excel2007($objPHPExcel); 
        $objWriter->save('excelresult/'.$target.'.xlsx');

        $objPHPExcel->disconnectWorksheets();
        unset($objPHPExcel);

        $y++;
    }
    $x++;
} 
?>
Mark Baker
  • 209,507
  • 32
  • 346
  • 385
  • Thanks for your answer, I had tried above and also very fast, but here the last data also comes into new file. For example first iteration takes some data and save to one excel, after in next iteration it takes new data and also the past data and saves to next excel file.Please help me. – RadhaComEng Apr 10 '15 at 09:07
  • Well then rather than using the same PHPExcel object for every different workbook that you save, which means you have to wipe it when setting the new data before each curl request, why not create a new PHPExcel object for every curl request? – Mark Baker Apr 10 '15 at 09:13
  • I had added curl code in question. If I will move this line $objPHPExcel = new PHPExcel(); below include_once('dom.php'); then the curl function loads data for some values and after that it will not fetching data from site. – RadhaComEng Apr 10 '15 at 09:34
  • Thanks, You make my day. Thanks @ Mark Baker. Finally everything is working fine. – RadhaComEng Apr 10 '15 at 09:40