
I am having a problem reading a 3 MB .xlsx data file, and the same for a 7 MB .xls data file. Is there any size limitation when reading files?

In my Excel file, I have 30,000 rows and 36 columns. Is there any solution so that I can read 100K records or more?

In my project I have to import 1 million records, but my code does not work for more than 29,000 records. Up to 29,000 records, the code works on my local machine.

Also, reading those 29,000 records takes too much time, maybe 25 minutes.

Can anyone please explain why this happens, and what should I do to resolve this?

Here is my code:

<?php
    error_reporting(E_ALL);
    set_time_limit(0);
    ini_set("memory_limit","-1");
    date_default_timezone_set('Europe/London');
    define('EOL',(PHP_SAPI == 'cli') ? PHP_EOL : '<br />');

   /**  Set Include path to point at the PHPExcel Classes folder  **/
   set_include_path(get_include_path() . PATH_SEPARATOR . 'Classes/');

  /**  Include PHPExcel_IOFactory  **/
  include 'Classes/PHPExcel/IOFactory.php';

  $inputFileName = 'files/30000rows.xls';
  $inputFileType = PHPExcel_IOFactory::identify($inputFileName);

 /**  Define a Read Filter class implementing PHPExcel_Reader_IReadFilter */
   class chunkReadFilter implements PHPExcel_Reader_IReadFilter
   {
      private $_startRow = 0;
      private $_endRow = 0;
      /**  Set the list of rows that we want to read  */
      public function setRows($startRow, $chunkSize) {
       $this->_startRow = $startRow;
       $this->_endRow = $startRow + $chunkSize;
     }
     public function readCell($column, $row, $worksheetName = '')
    {
          if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow))
         {
            return true;
         }
        return false;
    }
  }
  echo 'Loading file ',pathinfo($inputFileName,PATHINFO_BASENAME),' using IOFactory with a defined reader type of ',$inputFileType,'<br />';
   /**  Create a new Reader of the type defined in $inputFileType  **/
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    echo '<hr />';
    /**  Define how many rows we want to read for each "chunk"  **/
    $chunkSize = 1000;
   //total rows in excel
   $spreadsheetInfo = $objReader->listWorksheetInfo($inputFileName);
   $totalRows = $spreadsheetInfo[0]['totalRows'];
   /**  Create a new Instance of our Read Filter  **/
   $chunkFilter = new chunkReadFilter();
   /**  Tell the Reader that we want to use the Read Filter that we've  Instantiated  **/
  $objReader->setReadFilter($chunkFilter);
  $objReader->setReadDataOnly(true);
  /**  Loop to read our worksheet in "chunk size" blocks  **/
 for ($startRow = 2; $startRow <= $totalRows; $startRow += $chunkSize) {
    echo "in for loop<br>";
    echo 'Loading WorkSheet using configurable filter for heading row 1 and for rows ',$startRow,' to ',($startRow+$chunkSize-1),'<br />';
     /**  Tell the Read Filter, the limits on which rows we want to read this iteration  **/

     $chunkFilter->setRows($startRow,$chunkSize);

     $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
     $cacheSettings = array('memoryCacheSize' => '1000MB');
     PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

    $cacheMethod=PHPExcel_CachedObjectStorageFactory::cache_in_memory_serialized;
    PHPExcel_Settings::setCacheStorageMethod($cacheMethod);

    $cacheMethod = PHPExcel_CachedObjectStorageFactory::cache_in_memory_gzip;
    if (!PHPExcel_Settings::setCacheStorageMethod($cacheMethod)) {
        die($cacheMethod . " caching method is not available" . EOL);
    }
    echo date('H:i:s') , " Enable Cell Caching using " , $cacheMethod , "   method" , EOL;


     /**  Load only the rows that match our filter from $inputFileName to a PHPExcel Object  **/
     $objPHPExcel = $objReader->load($inputFileName);
     $objWorksheet = $objPHPExcel->getActiveSheet();
     $highestColumn = $objWorksheet->getHighestColumn();
     $sheetData = $objWorksheet->rangeToArray('A'.$startRow.':'.$highestColumn.($startRow + $chunkSize - 1), null, false, false, true);
     echo '<pre>';
     print_r($sheetData);
     $objPHPExcel->disconnectWorksheets();
     unset($objPHPExcel);
     echo '<br /><br />';
    }
 ?>
  • Are there any errors? Try not to print that much information while processing data. Try to optimize the code as much as possible. Have you tried doing the same thing via CLI? – Justinas Feb 20 '15 at 06:23
  • According to [this discussion](http://stackoverflow.com/q/4895230/365237), the only phpexcel limits are around "65,536 rows and 256 (IV) columns", which would suggest memory or timeout limit being hit. You should probably try examine if you hit memory limits of the machine or run through cli like @Justinas suggested. – eis Feb 20 '15 at 06:24
  • Also, if you print that much data during for loop, it will surely slow you down. Try commenting the debug printouts off (or log and buffer them to a file). – eis Feb 20 '15 at 06:30
  • Of course using `$objWorksheet- >rangeToArray()` is also going to use a lot of memory, because you're loading a lot of data into a large PHP array, it's always better to process row by row if you can do so – Mark Baker Feb 20 '15 at 12:34
  • And which caching method are you actually using, you set three.... the last being gzipped in memory, have you tried checking which method is the most efficient for your PHP version and configuration? – Mark Baker Feb 20 '15 at 13:44
  • I've downvoted for (a) requesting urgency, (b) deliberately adding begging messages again after they were removed, and (c) not replying to the answer you received. – halfer Oct 28 '15 at 19:28
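
As Mark Baker's comment suggests, processing rows one at a time avoids building a single large PHP array with `rangeToArray()`. A minimal sketch using PHPExcel's row iterator, assuming `$objPHPExcel` has already been loaded with the chunk filter as in the question's loop:

```php
<?php
// Row-by-row processing instead of rangeToArray(); keeps only one row
// in a PHP variable at a time. Assumes $objPHPExcel and $startRow exist
// as in the question's chunked loading loop.
$objWorksheet = $objPHPExcel->getActiveSheet();

foreach ($objWorksheet->getRowIterator($startRow) as $row) {
    $cellIterator = $row->getCellIterator();
    // Include empty cells so every row yields the same number of columns
    $cellIterator->setIterateOnlyExistingCells(false);

    $rowData = array();
    foreach ($cellIterator as $cell) {
        $rowData[] = $cell->getValue();
    }
    // Process $rowData here (e.g. insert it into the database), then let
    // it go out of scope instead of accumulating all rows in memory.
}
```

Combined with disabling the debug `print_r()` output, this keeps per-chunk memory roughly constant instead of growing with the chunk size.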

1 Answer


To read XLSX files, I recommend using Spout. It makes dealing with large files super simple. Here is how you would do it:

use Box\Spout\Reader\ReaderFactory;
use Box\Spout\Common\Type;

$reader = ReaderFactory::create(Type::XLSX);
$reader->open($filePath);

while ($reader->hasNextSheet()) {
    $reader->nextSheet();

    while ($reader->hasNextRow()) {
        $row = $reader->nextRow();
        // do stuff
    }
}

$reader->close();

This works for any file, regardless of the file size. No need to worry about caching, filtering, memory consumption. It will require less than 10MB of memory and should take less than a minute to process the entire file.
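
For instance, since rows arrive one at a time, they can be written to a database in batches as they are read. A hypothetical sketch reusing the reader loop above, where `$pdo` and the `insertBatch()` helper are placeholders, not part of Spout:

```php
<?php
// Hypothetical: stream rows from the Spout reader into batched database
// inserts, so memory stays flat no matter how many rows the file has.
// $reader is the opened Spout reader from the answer's snippet; $pdo and
// insertBatch() are placeholders for your own database layer.
$batch = array();

while ($reader->hasNextRow()) {
    $batch[] = $reader->nextRow();

    if (count($batch) >= 500) {
        insertBatch($pdo, $batch); // e.g. one multi-row INSERT per batch
        $batch = array();
    }
}

// Flush any remaining rows that didn't fill a complete batch
if (!empty($batch)) {
    insertBatch($pdo, $batch);
}
```

Batching the inserts (rather than one `INSERT` per row) is usually what brings a million-record import down from tens of minutes to a manageable time.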

Adrien
  • Hi Adrien, I have the same issue as stated above. Can you please suggest a library that would work for XLSX as well as XLS files? – ba1ar Feb 23 '15 at 07:40
  • Hey @ba1ar, as far as I know, there are no PHP libraries that can process both large XLSX and XLS files. It's a very complex task that requires a lot of optimizations so it's not trivial to create such a library. Is Spout working for the XLSX file at least? – Adrien Feb 23 '15 at 16:27
  • Thanks Adrien, I have read the documentation; it states that Spout supports XLSX. So I tried to find a library that supports both XLS and XLSX files, but without success. That's why I'm asking if you have something to help me out. – ba1ar Feb 25 '15 at 05:31
    The only library that would work is PHPExcel (the one you started using). There may be more optimizations you can make but I see you are already using a bunch of them (cell caching, reading data in chunks, setReadDataOnly). Unfortunately I don't know any other alternatives that could solve your problem. You can still submit a pull request to Spout if you want to add support for XLS files and use it for XLSX – Adrien Feb 25 '15 at 16:36
  • Here is the solution that worked for me: http://stackoverflow.com/questions/2966439/reading-large-excel-file-with-php/36486988#36486988 – Harsha Apr 07 '16 at 20:54