PHP: Remove a row from a CSV if ID occurs multiple times and a column has a specific value

Question

I have data as below in a CSV file. The green lines are what I'd like to keep.

Basically if someone has a single line, I want to keep it.

If they have multiple lines, I want to remove any where the 3rd column is A - Fully Fit.

After running through the whole file, I then want to save it over the original.

I tried writing the code but not sure if it's a Friday but the logic is escaping me at the moment.

There could be just a single line or there could be dozens. So if an ID has 1000 rows, I would just want to delete the rows for that ID where the 4th column is "A - Fully Fit"

UPDATE

This code works but not sure if it's the most optimal

// DECLARE ARRAYS
$a = Array();
$b = Array();
$c = Array();

// LOOP THROUGH FILE AND COLLECT DUPLICATES
if (($handle = fopen($filename, "r")) !== FALSE) {      

    while (($line = fgetcsv($handle, 4096, ",")) !== FALSE) {
        $a[$line[1]][] = 'SHAKKA';
    }

    foreach ($a as $k=>$v) {
        if (count($v)>1) {
            $b[] = $k;
        }
    }
}

// IF A DUPLICATE THEN REMOVE ROWS THAT ARE 'A - Fully Fit'
if (($handle = fopen($filename, "r")) !== FALSE) {      
    while (($golly = fgetcsv($handle, 4096, ",")) !== FALSE) {
        if (in_array($golly[1],$b) && $golly[3] == 'A - Fully Fit')  {
        }
        else {
            $c[] = $golly;
        }
    }
}
fclose($handle);

// WRITE THE FILE
$fp = fopen($filename, 'wa+');
foreach ($c as $fields) {
    fputcsv($fp, $fields);
}
fclose($fp);

The green lines have 3rd column as `A - Fully Fit` but you want to keep green lines, yet remove rows that have 3rd column as `A - Fully Fit`? — nice_dev, Oct 11 '19 at 10:09
Ok, so the question is every row needs to have unique ID(2nd column) and unique value in 4th column? — nice_dev, Oct 11 '19 at 10:17
No - if an ID based off column 2 has multiple records (2 or more), then the record with the 4th column having "A - Fully Fit" needs to be deleted — pee2pee, Oct 11 '19 at 10:19
"Basically if someone has a single line, I want to keep it.". I thought the combination of column 6 and 7 had to be unique. — Gerard, Oct 11 '19 at 10:22
@pee2pee So the combination of 2nd column(ID) and 4th column needs to be unique? Like from `[16 - A - Fully Fit,16 - A - Fully Fit,18 - A - Fully Fit]`, only these `[16 - A - Fully Fit,18 - A - Fully Fit]` needs to survive? — nice_dev, Oct 11 '19 at 10:27
The first line in the screenshot - that person only has one line, so I want to keep it. The next person has two lines. Based off the fact one of those has "A - Fully Fit" in there, I'll delete the second row and keep the third. Same for the 4th and 5th. They have two rows but since one has A - Fully Fit, I'll delete that and keep the other(s) — pee2pee, Oct 11 '19 at 10:28
@pee2pee So, only unique values of 4th column needs to be present? If a row has a value in the 4th column already visited, delete them. Right? — nice_dev, Oct 11 '19 at 10:30
No - sorry not sure how to explain it anymore but I'll give it a go. If your ID has 100 records in there, I just want to delete any record where A - Fully Fit is in the 4th column. If you have a single row, I would just leave it. — pee2pee, Oct 11 '19 at 10:40

nice_dev · Accepted Answer · 2019-10-11T12:10:40.947

1

<?php


if (($handle = fopen($filename, "r")) !== FALSE) {      
    $new_rows = [];
    $hash_ids = [];

    while (($line = fgetcsv($handle, 4096, ",")) !== FALSE) {
        if($line[3] === 'A - Fully Fit'){
            if(isset($hash_ids[$line[1]])) continue;
            $hash_ids[$line[1]] = true;
        }
        $new_rows[] = $line;
    }

    fclose($handle);

    // WRITE TO THE FILE
    $fp = fopen($filename, 'wa+');
    foreach ($new_rows as $current_row) {
        fputcsv($fp,$current_row);
    }
    fclose($fp);
}

In the above code, we maintain a hash(associative array) for each ID. If we have already come across with an ID with A - Fully Fit, then we skip the row, else we add it to the list.

edited Oct 11 '19 at 12:10

answered Oct 11 '19 at 11:18

nice_dev

17,053
2
21
35

Once the flag is set, it never resets so wouldn't that mean anyone else who is fully fit and has one row, they will be excluded? – pee2pee Oct 11 '19 at 11:22
@pee2pee So, only one occurrence of `A - Fully Fit` per ID? Is this the question? – nice_dev Oct 11 '19 at 11:23
No. Sorry, I can't explain it anymore simply than I have already done and with examples and I've now done the code myself. Thanks for trying to help. – pee2pee Oct 11 '19 at 11:31
If an ID has 1,000,000 rows, I would just want to delete the rows for that ID where the 4th column is "A - Fully Fit". If an ID had just one single row where the 4th column was "A - Fully Fit" I would want to keep it – pee2pee Oct 11 '19 at 11:35
@pee2pee I have edited my answer. Should work according to your needs. – nice_dev Oct 11 '19 at 12:11

PHP: Remove a row from a CSV if ID occurs multiple times and a column has a specific value

1 Answers1