
I'm trying to loop through a 3 MB Excel document to get all the data I then have to insert into the database. The worksheet I'm using has 6,500 rows, but that may vary in the future. I've noticed that even though I'm using the recommended memory-saving techniques, it still runs out of memory:

$reader = PHPExcel_IOFactory::createReaderForFile($file_path);
$reader->setReadDataOnly(true);

//$sheets = $this->getWorksheetNames($file['tmp_name'], 0);
$reader->setLoadSheetsOnly('spreadsheetname');

$chunkFilter = new IPO_Reader(); 
$reader->setReadFilter($chunkFilter); 

$highestRow    = 10000; //$this->objWorksheet->getHighestRow();
$chunkSize     = 1; 
$highestColumn = "Y";

for ($startRow = 2; $startRow <= $highestRow; $startRow += $chunkSize) 
{ 

    $chunkFilter->setRows($startRow, $chunkSize); 
    $objPHPExcel  = $reader->load($file_path); 

    for ($row = $startRow; $row < $startRow + $chunkSize; $row++) // '<', not '<=', otherwise each chunk reads one extra row
    {
        $this->read_row = $objPHPExcel->getActiveSheet()->rangeToArray('A'.$row.':'.$highestColumn.$row, null, true, true, true);

        $this->read_row = end($this->read_row);         

        foreach($this->read_row as $column => $value)
        {
            $db_column_name = $this->_getDbColumnMap($column);
            if(!empty($db_column_name))
            {
                $this->new_data_row[$db_column_name] = $this->_getRowData($value, $column);
            }   

        }

        $this->read_row = null;
        $this->new_data_row['date_uploaded']    = date("Y-m-d H:i:s");
        $this->new_data_row['source_file_name'] = $file_name;
        $ipo_row  = new Model_UploadData_IPO();
        $ipo_row->create($this->new_data_row);
        $this->new_data_row = null;
        unset($ipo_row);

        gc_collect_cycles();

    }
    $objPHPExcel->disconnectWorksheets();
    unset($objPHPExcel);
    gc_collect_cycles();
}

When I test the memory usage before and after unsetting $objPHPExcel, there is no memory gain. I'm really confused by this: splitting the load into chunks does not seem to let me clear the memory after each chunk, the usage rises steadily, and with the limit set to 250 MB it only allows me to add ~500 records.
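For reference, a chunked read filter like the IPO_Reader used above is typically implemented along these lines. This is a sketch following the standard PHPExcel_Reader_IReadFilter pattern from the PHPExcel documentation; the actual IPO_Reader class may differ:

```php
<?php
// Sketch of a chunk read filter, based on the standard
// PHPExcel_Reader_IReadFilter pattern. The real IPO_Reader may differ.
class IPO_Reader implements PHPExcel_Reader_IReadFilter
{
    private $startRow = 0;
    private $endRow   = 0;

    // Restrict reading to $chunkSize rows starting at $startRow.
    public function setRows($startRow, $chunkSize)
    {
        $this->startRow = $startRow;
        $this->endRow   = $startRow + $chunkSize;
    }

    // Only cells in the header row or the current chunk are read;
    // everything else is skipped by the reader.
    public function readCell($column, $row, $worksheetName = '')
    {
        return $row == 1 || ($row >= $this->startRow && $row < $this->endRow);
    }
}
```

Note that even with such a filter, $reader->load() still parses the whole file on every iteration; the filter only limits which cells get instantiated in memory.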

pnuts
Kasia Gogolek
  • and this isn't the only problem =) just take a look at http://stackoverflow.com/questions/6857075/problem-with-excell-export (it eats 500MB) – Subdigger Aug 11 '11 at 10:12
  • sorry missed half of the sentence there. allows me to only add ~500 records – Kasia Gogolek Aug 11 '11 at 10:16
  • Kasia: you can edit your questions if you make a mistake – Mchl Aug 11 '11 at 11:07
  • thx Mchl, I don't seem to to be the only one with this issue: http://phpexcel.codeplex.com/discussions/267483 has exactly the same problem! – Kasia Gogolek Aug 11 '11 at 11:25
  • I've also had issues with PHPExcel failing on large files. I eventually decided to use COM interface to talk to Excel directly. It's tricky, requires you to run the script on Windows and have Excel installed... but at least it works. – Mchl Aug 11 '11 at 12:51
  • as for me - simple solution is to use html tables. ease & cheap – Subdigger Aug 11 '11 at 20:30
  • for upload or for display? I need to upload Excel spreadsheet as per clients requirement. and i don't want to display anything, just want to save stuff to the db – Kasia Gogolek Aug 12 '11 at 11:29

2 Answers


The PHPExcel library is known to have these memory issues; I had problems with it too. What worked for me was this advice (from the link above; try it, there is good advice there on how to reduce memory usage):

$objReader = new PHPExcel_Reader_Excel5();
$objReader->setReadDataOnly(true); /* this */

But the memory requirements are still large, because the library allocates a lot of memory for each cell (for formatting etc., even if you don't need it). I'm afraid we are helpless until they release a new version of the library.
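Besides setReadDataOnly(), PHPExcel also lets you change its cell caching strategy, which often cuts the per-cell memory cost substantially. A sketch using the documented PHPExcel_Settings::setCacheStorageMethod() API (the cache method and buffer size here are just one reasonable choice, not the only one):

```php
<?php
// Must be configured before any workbook is loaded.
// cache_to_phpTemp keeps cell data in php://temp, which spills to
// disk once the in-memory buffer (here 8 MB) is exceeded.
$cacheMethod   = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array('memoryCacheSize' => '8MB');
PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings);

$reader = PHPExcel_IOFactory::createReaderForFile($file_path);
$reader->setReadDataOnly(true);
$objPHPExcel = $reader->load($file_path);
```

Disk-backed caching trades speed for memory, so loading gets slower, but it can make the difference between ~500 rows and the full sheet fitting under a 250 MB limit.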

Tomas

OK, everyone knows TRWTF is Excel, so may I ask whether it's possible for you to convert this to CSV?

I have my own CSV-to-table functions in PHP, which have been used to import very large files; CSV tends to be much lighter to process and also much less prone to random library issues.

If you only need this for a one-time process, or can go from XLS to CSV easily, please do so: it will make your life much easier (as it does every time you stick to simpler, more standard alternatives ;) ).

And for a tool that will translate the oh-so-evil and dreadful XLS format, you can use one of the following open-source converters. I'd recommend Python every time, but hey, your choice:

http://www.oooninja.com/2008/02/batch-command-line-file-conversion-with.html

http://code.google.com/p/jodconverter/wiki/FAQ

Basically the idea is the same: you use an external tool to get a usable file format, and then you go from there.

I don't think I have my csvtotable.php script here, but it's quite easy to replicate; you just need a few basic tools like csvtoarray and arraytoinsertstatements.
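If the client's workflow ever does allow CSV, the import loop is tiny, since PHP's built-in fgetcsv() streams one row at a time and never holds the whole file in memory. A minimal sketch (the $csv_path variable and the DB insert are placeholders you'd fill in):

```php
<?php
// Stream a CSV file row by row; memory use stays flat regardless of file size.
$handle = fopen($csv_path, 'r');
$header = fgetcsv($handle); // first row = column names

while (($row = fgetcsv($handle)) !== false) {
    // Map column names to values for this row.
    $record = array_combine($header, $row);
    // Insert $record into the DB here, e.g. via a prepared statement.
}
fclose($handle);
```

This is the main practical advantage of CSV over XLS here: the reader is a cursor over the file rather than an in-memory object model of the whole workbook.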

GL ;)

Morg.
    I would love to be just uploading CSV files, but unfortunately, that's not what the client wants. – Kasia Gogolek Sep 22 '11 at 15:32
  • Okay I have a solution for that too ... What about the bus ? - no kidding use the OpenOffice API, it can do that quite easily imo - links in main post – Morg. Sep 23 '11 at 08:19
  • At my last job, I also had a problem with PHPExcel and memory issues with large Excel files. While I was only reading them, I still had problems as some of our vendors were uploading 20MB+ files. If I remember correctly, I, too, used the OpenOffice API to open the file and convert it to CSV, then used the new file to iterate through and insert records into my DB. I had already offered upload capabilities for OpenOffice Calc files to our users, so I had the code, already. If you get this working using OpenOffice, we'd love to see your solution. – MB34 Nov 23 '11 at 14:51