I am parsing an Excel CSV file on OS X El Capitan; the CSV is here

The problem is that line endings are encoded as '\x0d' (CR).

1] I was able to convert the file so that line endings are '\x0a' (NL) with:

$> perl -e 'open(fh, "<coord2.csv") or die("can t open file"); 
   binmode(fh); $/=\10; 
   while ($t=<fh>) { $t =~ s/\x0d/\x0a/g; print "$t"; } print "\n"; '

but before that I tried two other ways and they failed. I would like to know if you have an explanation for why they fail.

2] I tried for a while to slurp the whole file at once with:

$> cat coord2.csv | perl -e 'undef $/; $t=<>; print "$t \n"; '

but the output does not show the whole file; I see only:

Prelevacampioni ZARA;45.0640, 11.1943;F ;F 9.954467;F00;F599242;F

3] I also tried setting the $/ variable, with:

$> cat coord2.csv | perl -e '$/="\x0d"; @L=<>; 
                       for (@L) { print $_; } print "\n"; '

but again, I see only the same output as in point [2].

I am puzzled. Can you explain what I am doing wrong in points [2] and [3]?

Nicola Mingotti
  • Your third example has syntax errors. The quotes are broken. Having `\r` as the line ending character is normal on MacOS. If you run this on the same mac, you can just use `chomp` and Perl will do the right thing. But then it seems like you want to read the whole file at once, so why bother with line endings anyway? – simbabque Aug 30 '17 at 10:41
  • [1] Thank you for the syntax correction, but the results are the same after it. [2] About line endings I disagree with you: if you run `ls -1 | hexdump -C` you can see that line endings are encoded as NL, not CR. [3] I put the chomp before `print $_` and now it prints all lines! Great, now I understand what happened. Thank you. – Nicola Mingotti Aug 30 '17 at 13:00

1 Answer

Thanks to a comment I was able to understand the strange behaviour in examples [2] and [3]. I am writing the solution here for people who may have the same problem in the future.

The point is that OS X encodes newlines as NL = '\n' = '\x0a', while in the CSV file line endings are encoded as CR = '\r' = '\x0d'. Perl was reading the data correctly, but the output looked mangled because each CR sent the terminal cursor back to the start of the same line, so every record overwrote the previous one.
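
To check which line-ending bytes a file actually contains, one can count and dump them instead of letting the terminal interpret them. This is a minimal sketch that assumes a small hypothetical sample file, `cr_sample.csv`, rather than the real coord2.csv:

```shell
# Build a small CR-terminated sample (hypothetical file name and contents).
printf 'a;1\rb;2\rc;3\r' > cr_sample.csv

# Count CR and LF bytes; a CR-only file contains no LF bytes at all.
cr_count=$(tr -cd '\r' < cr_sample.csv | wc -c | tr -d ' ')
lf_count=$(tr -cd '\n' < cr_sample.csv | wc -c | tr -d ' ')
echo "CR bytes: $cr_count, LF bytes: $lf_count"   # prints: CR bytes: 3, LF bytes: 0

# Dump the bytes so each \r is shown as text rather than moving the cursor.
od -c cr_sample.csv
```

If the LF count is zero and the CR count matches the number of records, the file uses CR-only line endings.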

So, the corrected code for examples [2] and [3] is:

[2] $> cat coord2.csv | perl -e 'undef $/; $t=<>; 
           $t =~ s/\x0d/\x0a/g; print "$t \n"; '


[3] $> cat coord2.csv | perl -e '$/="\x0d"; @L=<>; 
          for (@L) { s/\x0d/\x0a/g; print $_; } print "\n"; '
Nicola Mingotti