
I'm looking at importing a CSV file, but this file is rather large.

What I'm looking to do is two things:

1) Scan the CSV to validate values in particular fields
2) Once the file is valid, import

The CSV data can ONLY be inserted if the file is valid (All or nothing)

The problem is that I'm looping through the data twice: first to check that the CSV fields are valid, and then a second loop to save.

The trouble is memory: I run out of it (the file is 100,000 rows with 45 fields).

Is there an easier way to do this that reduces memory usage? I'm using an ActiveRecord implementation; would using PDO be quicker?

Thanks

EDIT:

        $data = array();
        // open the file and loop through it one line at a time
        if (($handle = fopen('details.csv', 'r')) !== FALSE) {
            $rowCounter = 0;
            while (($rowData = fgetcsv($handle, 0, ',')) !== FALSE) {
                if (0 === $rowCounter) {
                    $headerRecord = $rowData;
                } else {
                    foreach ($rowData as $key => $value) {
                        $data[$rowCounter - 1][$headerRecord[$key]] = $value;
                    }
                }
                $rowCounter++;
            }
            fclose($handle);
        }

        $errors = array();
        // loop through the parsed rows to validate the fields
        for ($i = 0; $i < count($data); $i++) {
            if (!valid_email($data[$i]['EMAIL'])) {
                $errors[] = 'Invalid Email Address on row ' . ($i + 1);
                break;
            }
        }

        if (empty($errors)) {
            for ($j = 0; $j < count($data); $j++) {
                $details = new Details();

                // set the fields here
                $details->email = $data[$j]['EMAIL'];

                $details->save();
                unset($details);
            }
        }
sipher_z
  • You could use `fopen` and `fgets` and parse it manually one line at a time. What do you mean "is valid"? – h2ooooooo Jan 20 '14 at 12:22
  • We probably have to see your code in order to help you reduce memory usage. Also, you should be inserting in a single transaction rather than row by row. – mcryan Jan 20 '14 at 12:22
  • @h2ooooooo I have validation in terms of fields (must be a number starting with 99 and be 15 chars long etc etc). I'm using `fopen` and `fgetcsv` to open and parse – sipher_z Jan 20 '14 at 12:26
  • @mcryan I have included some code – sipher_z Jan 20 '14 at 12:36

1 Answer


You're already looping through the data in your first loop. Why don't you validate the fields inside that loop, add each row that passes to an array of rows to save, and only save (in a single transaction) if the loop completes without any errors?
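Something along these lines (a minimal sketch, reusing `valid_email()` and the `Details` model from your code; `getPdo()` is a hypothetical helper standing in for however your AR layer exposes its underlying PDO connection):

    $errors = array();
    $validRows = array();

    if (($handle = fopen('details.csv', 'r')) !== FALSE) {
        $headerRecord = fgetcsv($handle, 0, ',');
        $rowNumber = 1;

        while (($rowData = fgetcsv($handle, 0, ',')) !== FALSE) {
            $rowNumber++;
            $row = array_combine($headerRecord, $rowData);

            // validate while reading - no second pass over the data
            if (!valid_email($row['EMAIL'])) {
                $errors[] = 'Invalid Email Address on row ' . $rowNumber;
                break;
            }

            $validRows[] = $row;
        }
        fclose($handle);
    }

    // only save if the whole file validated, and do it in one transaction
    if (empty($errors)) {
        $pdo = getPdo(); // hypothetical accessor for your AR layer's PDO handle
        $pdo->beginTransaction();
        try {
            foreach ($validRows as $row) {
                $details = new Details();
                $details->email = $row['EMAIL'];
                $details->save();
            }
            $pdo->commit();
        } catch (Exception $e) {
            $pdo->rollBack();
            throw $e;
        }
    }

This still holds the valid rows in memory, but only once instead of twice, and the transaction means a failure part-way through the save leaves the table untouched.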

mcryan
  • What is the performance though of saving such data in a single transaction? The file could increase to 1,000,000 rows plus! – sipher_z Jan 20 '14 at 13:09
  • That will also mean you'll execute 1,000,000 operations on the DB when you could do it all in 1! Looping through the results multiple times is also totally unnecessary and definitely a performance killer. – mcryan Jan 20 '14 at 13:51
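
For illustration, a batched insert along the lines mcryan describes might look like this with PDO (a sketch, not tested against your schema; it assumes a `details` table with an `email` column, an open `$pdo` connection, and the `$validRows` array from the sketch above):

    $pdo->beginTransaction();
    try {
        // insert 1,000 rows per statement instead of one statement per row
        foreach (array_chunk($validRows, 1000) as $chunk) {
            $placeholders = implode(',', array_fill(0, count($chunk), '(?)'));
            $stmt = $pdo->prepare('INSERT INTO details (email) VALUES ' . $placeholders);

            // flatten the chunk into a single positional-parameter list
            $params = array();
            foreach ($chunk as $row) {
                $params[] = $row['EMAIL'];
            }
            $stmt->execute($params);
        }
        $pdo->commit();
    } catch (Exception $e) {
        $pdo->rollBack();
        throw $e;
    }

Chunking keeps each statement's parameter count below driver limits while still cutting 1,000,000 single-row saves down to roughly 1,000 statements.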