How to skip duplicate (by header) CSV columns in League Reader?

Question

I'm using league/csv:^9.6 (9.6.2).

I need to parse CSV files that have a lot of columns (150+) and extract a small number of them (8). The positions of those columns are not fixed (the offsets are unknown), but each has a stable, unique header, so I want to select them by header, not by offset.

Here is the problem: the library throws an exception about duplicate headers. Is there a way to skip the duplicated columns and still parse the rest of the file correctly?

Or is there a workaround to strip them from the file before using this library? The position and number of those duplicates are not known beforehand.

Thanks!

Braker · Answer 1 · 2021-08-31T15:27:43.913

You could map the column names yourself. Do not set a header offset; League will then return each row with integer keys instead of header keys, so duplicate headers are never checked.

Then use an array containing the column names you want to build a mapping from column name to index. I made a small working example:

test.csv

name;;;age;;;;a;a;a;;;;
John;;;32;;;;;;;;;;
Jane;;;28;;;;;;;;;;

test.php

<?php

// Assumes league/csv was installed with Composer
require 'vendor/autoload.php';

use League\Csv\Reader;

// The columns you want to extract
$columns = ['name', 'age'];

// Read-only mode is enough, the file is never written to
$reader = Reader::createFromPath('test.csv', 'r');
$reader->setDelimiter(';');
$grab = [];

// Find the indexes of the wanted columns in the first (header) row
foreach ($reader->fetchOne() as $index => $column) {
    if (in_array($column, $columns, true)) {
        $grab[$column] = $index;
    }
}

foreach ($reader->getRecords() as $i => $row) {
    // Skip the header row itself
    if ($i === 0) {
        continue;
    }

    $filteredRow = [];
    foreach ($grab as $column => $index) {
        $filteredRow[$column] = $row[$index];
    }

    // $filteredRow now contains only the needed columns
    var_dump($filteredRow);
}

Outputs:

array(2) {
  ["name"]=> string(4) "John"
  ["age"]=> string(2) "32"
}
array(2) {
  ["name"]=> string(4) "Jane"
  ["age"]=> string(2) "28"
}
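
If one of the wanted headers is itself repeated, or is missing from the file, the loop above silently takes the last occurrence or skips the column. A minimal sketch of a stricter mapping follows. The helper names buildColumnMap and pickColumns are made up for this example and are not part of league/csv, and explode() only stands in for the integer-keyed rows the Reader would return:

```php
<?php

/**
 * Build a map of wanted header name => column index from a header row.
 * Hypothetical helper, not part of league/csv.
 */
function buildColumnMap(array $header, array $wanted): array
{
    $map = [];
    foreach ($header as $index => $name) {
        // Keep the first occurrence only, so a repeated header cannot overwrite it
        if (in_array($name, $wanted, true) && !isset($map[$name])) {
            $map[$name] = $index;
        }
    }

    // Fail loudly if a wanted column is not in the file at all
    $missing = array_diff($wanted, array_keys($map));
    if ($missing !== []) {
        throw new RuntimeException('Missing columns: ' . implode(', ', $missing));
    }

    return $map;
}

/** Pick the mapped cells out of one integer-keyed row (hypothetical helper). */
function pickColumns(array $row, array $map): array
{
    $picked = [];
    foreach ($map as $name => $index) {
        // Short rows yield null instead of an undefined-offset warning
        $picked[$name] = $row[$index] ?? null;
    }

    return $picked;
}

// Stand-in data: the same header and first row as test.csv above
$header = explode(';', 'name;;;age;;;;a;a;a;;;;');
$row    = explode(';', 'John;;;32;;;;;;;;;;');

$map    = buildColumnMap($header, ['name', 'age']);
$result = pickColumns($row, $map);

var_dump($result);
```

With the Reader, you would pass $reader->fetchOne() as the header and each record from getRecords() as the row.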
