
I'm currently importing CSV data and need to parse it into an array of rows.

A smaller sample of the data follows.

"Name","Address"
"John Doe","5111 Fury Rd
Santa Cruz"
"Jane Doe","321 Tess St Texas"
"Josh Doe","653 1st St 
Orlando Florida
United States"

As you can see, we need to split on line breaks outside of quotes, since str_getcsv parses only one row at a time.

I had originally used this expression.

$lines = preg_split('/[\r\n]{1,2}(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/',$data);

However, preg_split failed once the string exceeded XXXX characters.

So I'm currently resorting to preg_match_all, but I'm having issues with the regex.

preg_match_all('/^(.*?)[\r\n]{1,2}(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))/', $data, $matches);

Currently that matches only the first instance.

Array(
    [0] => Array ( [0] => "Name","Address")
    [1] => Array ( [0] => "Name","Address")
)

Any idea how to get it to return all the data in an array?

user1512593

2 Answers


Here is one way to parse it. I have commented out the part that replaces the newlines in the address with spaces. If you want that behavior, just uncomment it.

$re = '/\"(.*?)\",\"(.*?)\"/s';
$data = '"Name","Address"
"John Doe","5111 Fury Rd
Santa Cruz"
"Jane Doe","321 Tess St Texas"
"Josh Doe","653 1st St
Orlando Florida
United States"';

preg_match_all($re, $data, $matches);

/*
foreach($matches[2] as &$value){
    $value = str_replace(PHP_EOL, " ", $value);
}
*/
var_dump($matches);

https://3v4l.org/7kRDt

Output with the foreach:

array(3) {
  [0]=>
  array(4) {
    [0]=>
    string(16) ""Name","Address""
    [1]=>
    string(36) ""John Doe","5111 Fury Rd
Santa Cruz""
    [2]=>
    string(30) ""Jane Doe","321 Tess St Texas""
    [3]=>
    string(53) ""Josh Doe","653 1st St
Orlando Florida
United States""
  }
  [1]=>
  array(4) {
    [0]=>
    string(4) "Name"
    [1]=>
    string(8) "John Doe"
    [2]=>
    string(8) "Jane Doe"
    [3]=>
    string(8) "Josh Doe"
  }
  [2]=>
  array(4) {
    [0]=>
    string(7) "Address"
    [1]=>
    string(23) "5111 Fury Rd Santa Cruz"
    [2]=>
    string(17) "321 Tess St Texas"
    [3]=>
    &string(40) "653 1st St Orlando Florida United States"
  }
}
Andreas
  • Thanks for the answer Andreas. Unfortunately, there are 100+ columns of data in each row. This would split the data into pairs, i.e. [0]=>"First Name","Last Name" [1]=>"City","State"... etc. – user1512593 Aug 15 '17 at 15:18
  • @user1512593 Then just add an appropriate number of patterns. You can build the pattern with a for loop. `'/\"(.*?)\",\"(.*?)\",\"(.*?)\"/s';` <- is for three columns. Just keep adding the pattern over and over – Andreas Aug 15 '17 at 15:35
  • Thanks. I thought it was crazy talk repeating the same thing 100+ times, but it works great, Andreas. Thanks! – user1512593 Aug 15 '17 at 15:56
  • @user1512593 It's not a solution I'm proud of, but if it works... Keep in mind, if one entry does not have the starting or ending " it will fail miserably – Andreas Aug 15 '17 at 16:56
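
The loop-built pattern suggested in the comments could be sketched roughly as below. The helper name buildQuotedRowPattern and the shortened sample string are made up for illustration, and the sketch assumes every field is wrapped in double quotes and never contains the sequence "," itself. It also passes PREG_SET_ORDER, which the answer does not use, so that the captures come back grouped per row.

```php
<?php
// Hypothetical helper (not from the answer): repeats the quoted-field
// pattern once per column and joins the copies with commas.
function buildQuotedRowPattern(int $columns): string
{
    // One lazily matched, double-quoted field.
    $field = '"(.*?)"';
    // The s flag lets . match the line breaks inside a field.
    return '/' . implode(',', array_fill(0, $columns, $field)) . '/s';
}

// Shortened sample data: two columns, one address spanning two lines.
$data = "\"Name\",\"Address\"\n"
      . "\"John Doe\",\"5111 Fury Rd\nSanta Cruz\"\n"
      . "\"Jane Doe\",\"321 Tess St Texas\"";

// PREG_SET_ORDER groups the captures per row instead of per column.
preg_match_all(buildQuotedRowPattern(2), $data, $matches, PREG_SET_ORDER);

// Drop the full match at index 0 so each row holds only its fields.
$rows = array_map(function (array $match): array {
    return array_slice($match, 1);
}, $matches);

print_r($rows);
```

For 100+ columns you would pass the real column count instead of 2. As noted in the comments, a row with a missing opening or closing quote will throw the matching off.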

If you need to use preg_match_all(), you could try this pattern to create an array of matches and then map str_getcsv() over the results, for example:

<?php

$csvString = <<<CSV
"Name","Address"
"John Doe","5111 Fury Rd
Santa Cruz"
"Jane Doe","321 Tess St Texas"
"Josh Doe","653 1st St
Orlando Florida
United States"
CSV;


preg_match_all('/(.*)(?:\n)/m', $csvString, $csvRows);

$csvData = array_map(function ($csvRow) {
    return str_getcsv($csvRow);
}, $csvRows[1]);

print_r($csvData);

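Given your example input, this yields the following. Note that quoted fields containing line breaks are split into separate rows, and the last line is dropped because no newline follows it: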

Array
(
    [0] => Array
        (
            [0] => Name
            [1] => Address
        )

    [1] => Array
        (
            [0] => John Doe
            [1] => 5111 Fury Rd
        )

    [2] => Array
        (
            [0] => Santa Cruz"
        )

    [3] => Array
        (
            [0] => Jane Doe
            [1] => 321 Tess St Texas
        )

    [4] => Array
        (
            [0] => Josh Doe
            [1] => 653 1st St
        )

    [5] => Array
        (
            [0] => Orlando Florida
        )

)

Hope this helps :)
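
Since the output above still breaks quoted fields at their line breaks, one regex-free alternative is to let fgetcsv() read the string from a temporary stream, because fgetcsv() reads a whole record at a time, including quoted fields that span lines. This is only a sketch under assumptions: the helper name parseCsvString is made up, and it assumes comma separators and double-quote enclosures as in the sample data.

```php
<?php
// Hypothetical helper (not from either answer): parses a whole CSV string,
// including quoted fields that contain line breaks.
function parseCsvString(string $csv): array
{
    // Copy the string into a temporary stream so fgetcsv() can read it.
    $handle = fopen('php://temp', 'r+');
    fwrite($handle, $csv);
    rewind($handle);

    $rows = [];
    // Arguments: no length limit, comma separator, double-quote enclosure,
    // backslash escape character (passed explicitly).
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($handle);

    return $rows;
}

$csvString = <<<CSV
"Name","Address"
"John Doe","5111 Fury Rd
Santa Cruz"
"Jane Doe","321 Tess St Texas"
"Josh Doe","653 1st St
Orlando Florida
United States"
CSV;

$rows = parseCsvString($csvString);
print_r($rows);
```

With the sample data this should give four rows of two fields each, with the line breaks kept inside the address values. Blank lines in the input come back as an array containing a single null, so you may want to filter those out for real data.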

Darragh Enright