Context:

I'm reading the "keys" (the first line) of a .csv file into an array, which I then use to build a multidimensional array $CSV from the file.
The keys that contain several words keep a leading and a trailing space as their first and last characters. The file is encoded in Windows-1252, which I convert to UTF-8.

Process:

$keys = mb_convert_encoding($keys, 'UTF-8', 'Windows-1252');
$keys = trim(str_replace('"', ' ', $keys));
$keys = explode(';', $keys);

Results:

Here are the first 2 keys; the 2nd one keeps its spaces.

Initial process (key => value):

[Commande] => C......
[ Date de création ] => 01/01/1970

Using urlencode(substr($keys[$j], 0, 1)) as value:

[Commande] => C
[ Date de création ] => +

Using rawurlencode(substr($keys[$j], 0, 1)) as value:

[Commande] => C
[ Date de création ] => %20

Using functions I found in other SO questions, like preg_replace('/\xc2\xa0/', '', $keys), always outputs %20.

I could skip this issue or work differently, but I don't understand why I can't trim() these strings.

Full sample code:

$file = file(__DIR__ . '/path/to/' . $csv_file);
// Keys
$keys = mb_convert_encoding($file[0], 'UTF-8', 'Windows-1252');
$keys = trim(str_replace('"', ' ', $keys));
$keys = explode(';', $keys);

$CSV = [];

for ($i = 1; $i < count($file); $i += 1) {
    $values = explode(';', $file[$i]);
    for ($j = 0; $j < count($values); $j += 1) {
        $values[$j] = mb_convert_encoding($values[$j], 'UTF-8', 'Windows-1252');
        $values[$j] = trim(str_replace('"', ' ', $values[$j]));
    }
    // Combine once per row, after every value has been converted.
    $CSV[] = array_combine($keys, $values);
}
die('<pre>' . print_r($CSV, true) . '</pre>');
AymDev
  • I'm very unclear how these code snippets and that output are connected exactly. Can you produce a sample that is self-contained, which we can execute and clearly connect input to output? – deceze Apr 27 '18 at 12:09
  • Also, when inspecting what a string consists of, `bin2hex` is essential. – deceze Apr 27 '18 at 12:09
  • `trim()` will not remove spaces in between. Just do: `$keys = str_replace('"', '', $keys);` – Alive to die - Anant Apr 27 '18 at 12:11
  • @deceze I did not know `bin2hex` could help me. I'm pasting the sample code. – AymDev Apr 27 '18 at 12:12
  • @AlivetoDie I know that, and I never mentioned that I want to remove all the spaces. I want to do what trim() is made for. – AymDev Apr 27 '18 at 12:18
  • ok got your point. do :-`$keys =str_replace(array('"',' '), ' ', $keys);` and check – Alive to die - Anant Apr 27 '18 at 12:22
  • The "full code" doesn't account for how `$keys` was created…!? – deceze Apr 27 '18 at 12:26
  • If I understand you correctly, you have at some point a string `$keys` as `_item1_;_item2_;_item3_..` (underscore instead of space). If you trim that string, it will indeed only trim the first and last space, not the intermediate spaces. – cypherabe Apr 27 '18 at 12:33
  • @AlivetoDie your code removed spaces but it also removed spaces in between. – AymDev Apr 27 '18 at 12:58
  • @deceze sorry, little mistake while copying to my question. `$keys` is `$file[0]` converted to UTF8 – AymDev Apr 27 '18 at 13:01
  • So your question is why you get `[ Date de création ]`…? – deceze Apr 27 '18 at 13:03
  • @cypherabe sorry, context was unclear. It keeps spaces that should be removed by `trim()` – AymDev Apr 27 '18 at 13:04
  • @deceze Yes, it should be `[Date de création]`. @AlivetoDie comment removed spaces but all of them. – AymDev Apr 27 '18 at 13:06

1 Answer

$keys = trim(str_replace('"', ' ', $keys));
$keys = explode(';', $keys);

Presumably you're starting with this line:

Commande;"Date de création";"Something something"

You're then turning it into this line (you're introducing the spaces here):

Commande; Date de création ; Something something 

Which you're then trimming (removing the spaces at the start and end of the line):

Commande; Date de création ; Something something

And then you're exploding the line:

array('Commande', ' Date de création ', ' Something something')
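
The walkthrough above can be reproduced with a short script. This is only a sketch: the header line is the presumed sample from above, not the asker's real file, and the variable names `$line`, `$withSpaces` and `$trimmed` are made up for illustration. `bin2hex` confirms that the leading character of the second key is an ordinary space.

```php
<?php
// Assumed sample header, taken from the walkthrough above (not the real file).
$line = 'Commande;"Date de création";"Something something"';

// Replacing each quote with a space is what introduces the padding.
$withSpaces = str_replace('"', ' ', $line);

// trim() only touches the two ends of the whole line.
$trimmed = trim($withSpaces);

// Splitting afterwards leaves the inner padding inside each key.
$keys = explode(';', $trimmed);

var_dump($keys);

// Inspect the first byte of the second key.
echo bin2hex($keys[1][0]), "\n"; // prints 20, a plain ASCII space
```

Because the byte is `20` rather than `c2a0` (a non-breaking space), the `preg_replace('/\xc2\xa0/', ...)` attempt from the question has nothing to match.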

  1. You need to trim each individual value after you have exploded the line, not before:

    $keys = array_map('trim', $keys);
    
  2. You should use CSV-parsing functions to parse CSVs, not re-invent the wheel:

    $keys = str_getcsv($file[0], ';');
    
  3. You should parse the entire CSV file using fgetcsv, which is more efficient:

    // Read one CSV row and convert every field from Windows-1252 to UTF-8.
    function read_and_convert_line($fh) {
        $values = fgetcsv($fh, 0, ';');
        if (!$values) {
            return false; // end of file or read error
        }
    
        return array_map(
            function ($value) { return mb_convert_encoding($value, 'UTF-8', 'Windows-1252'); },
            $values
        );
    }
    
    // fopen() requires a mode argument; 'r' opens the file read-only.
    $fh = fopen(__DIR__ . '/path/to/' . $csv_file, 'r');
    $headers = read_and_convert_line($fh); // the first line holds the keys
    $data = [];
    
    while ($row = read_and_convert_line($fh)) {
        $data[] = array_combine($headers, $row);
    }
    
    fclose($fh);
    print_r($data);
    

    This should eliminate the need for trim entirely.
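
The first two suggestions can be compared side by side on an in-memory string. This is a minimal sketch under assumptions: the header line is again the presumed sample, `$trimmedKeys` and `$parsedKeys` are illustrative names, and the enclosure and escape arguments of `str_getcsv` are passed explicitly rather than left to their defaults.

```php
<?php
// Assumed sample header, mirroring the presumed line above (not the real file).
$line = 'Commande;"Date de création";"Something something"';

// Fix 1: keep the original approach, but trim each key after exploding.
$trimmedKeys = array_map('trim', explode(';', str_replace('"', ' ', $line)));

// Fix 2: let the CSV parser strip the quotes, so no spaces are ever introduced.
$parsedKeys = str_getcsv($line, ';', '"', '\\');

var_dump($trimmedKeys === $parsedKeys); // bool(true)
print_r($parsedKeys);
```

Both approaches give the same keys for this input, but only the parser would also cope with a quoted field that itself contains a `;`.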

deceze