Context:
I'm reading the "keys" (the first line) of a .csv file to build an array, $CSV, that holds the file as a multidimensional array.
Keys containing multiple words keep a leading and a trailing space as their first and last characters. The file is encoded in Windows-1252, which I convert to UTF-8.
Process:
$keys = mb_convert_encoding($keys, 'UTF-8', 'Windows-1252');
$keys = trim(str_replace('"', ' ', $keys));
$keys = explode(';', $keys);
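For reference, the steps above can be reproduced on a made-up header line. This is only a sketch: the content of `$line` is an assumption about what the first line of the file looks like (quoted multi-word keys, `;` as separator, trailing newline as returned by file()), and the encoding conversion is left out because the literal is already UTF-8.

```php
<?php
// Hypothetical header line, NOT taken from the real file.
$line = "Commande;\"Date de création\";\"Nom du client\"\n";

// Same steps as in the process above, minus the encoding conversion.
$keys = trim(str_replace('"', ' ', $line)); // trim() sees the whole line here
$keys = explode(';', $keys);

var_dump($keys);
// $keys[0] is "Commande"
// $keys[1] is " Date de création " (both spaces kept)
// $keys[2] is " Nom du client"     (only the end of the line was trimmed)
```

In this sketch trim() runs on the whole line before explode(), so it can only remove whitespace at the two ends of the line. The spaces that replaced the inner quotes stay inside the line and end up at the edges of each key.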
Results:
Here are the first 2 keys; the 2nd one keeps its spaces.
Initial process (key => value):
[Commande] => C......
[ Date de création ] => 01/01/1970
Using urlencode(substr($keys[$j], 0, 1)) as value:
[Commande] => C
[ Date de création ] => +
Using rawurlencode(substr($keys[$j], 0, 1)) as value:
[Commande] => C
[ Date de création ] => %20
Using functions I found in other SO questions, like preg_replace('/\xc2\xa0/', '', $keys), always outputs %20.
I could skip this issue or work differently, but I don't understand why I can't trim() these strings.
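One way to check which bytes actually sit at the edges of a key is to dump them in hexadecimal. The sketch below assumes a hypothetical key value copied from the printed output above; it is not read from the real file, so the real bytes may differ.

```php
<?php
// Hypothetical key, copied from the printed output above.
$key = ' Date de création ';

// Hex dump of the first byte: "20" is a plain ASCII space.
$firstByte = bin2hex(substr($key, 0, 1));

// For comparison, a UTF-8 non-breaking space is two bytes and encodes differently.
$nbsp = rawurlencode("\xc2\xa0"); // "%C2%A0"

// The NBSP pattern does not match a plain space, so the string comes back unchanged.
$afterRegex = preg_replace('/\xc2\xa0/', '', $key);

// trim() applied to the single key, i.e. after explode() rather than before it.
$trimmed = trim($key);

var_dump($firstByte, $nbsp, $afterRegex === $key, $trimmed);
```

If the first byte is 20, as the %20 from rawurlencode() suggests, the character is an ordinary space rather than a non-breaking one, and trim() removes it when it is applied to each key individually.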
Full sample code:
$file = file(__DIR__ . '/path/to/' . $csv_file);
// Keys
$keys = mb_convert_encoding($file[0], 'UTF-8', 'Windows-1252');
$keys = trim(str_replace('"', ' ', $keys));
$keys = explode(';', $keys);
$CSV = [];
for ($i = 1; $i < count($file); $i += 1) {
    $values = explode(';', $file[$i]);
    for ($j = 0; $j < count($values); $j += 1) {
        $values[$j] = mb_convert_encoding($values[$j], 'UTF-8', 'Windows-1252');
        $values[$j] = trim(str_replace('"', ' ', $values[$j]));
    }
    // Combine once per row, after every value has been cleaned
    $values = array_combine($keys, $values);
    $CSV[] = $values;
}
die('<pre>' . print_r($CSV, true) . '</pre>');