I have code that parses a CSV file whose fields can contain newlines, but Text::CSV breaks when it encounters "\n" inside a field.

This is the parsing code

use Data::Dumper;
use Text::CSV;
my $csv = Text::CSV->new ({ binary=> 1, eol => $/, allow_loose_quotes => 1, allow_loose_escapes=> 1 }) || die $!;
#print Dumper($csv);

my $file = $ARGV[0];
open my $csv_handle, '<', $file or die "Can't open $file: $!";
while (my $row = $csv->getline($csv_handle)) {
    print Dumper($row);
}

This is the data

196766,31,"MR SRINIVASALU LAKSHMIPATHY\"DEC\"\
\"71"
196766,56,"255233.47"
Ram

1 Answer

You also need to set escape_char to \ (a backslash), since it defaults to ". However, this alone doesn't fix the problem if you are running the pure-Perl version of Text::CSV (Text::CSV_PP). With the XS version (Text::CSV_XS), the following works:

use strict; use warnings;
use Text::CSV;
use Data::Dumper;

my $csv = Text::CSV->new({
    binary => 1,
    eol => "\n",
    quote_char => '"',
    escape_char => '\\',
    auto_diag => 2,
    allow_loose_escapes => 1,
}) or die "Can't create CSV parser";

while( my $row = $csv->getline(\*DATA) ) {
    print Dumper $row;
}

__DATA__
1,"2
",3
196766,31,"MR SRINIVASALU LAKSHMIPATHY\"DEC\"\
\"71"
196766,56,"255233.47"

The pure-Perl parser fails on the 2nd record and complains about a missing closing quote. If we set allow_loose_quotes to a true value, then the CSV parses, but the 2nd record is split apart (a third record with a sole field containing \"71" is inserted). The XS version does not show this behaviour.

This looks like a bug in Text::CSV_PP.
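
Since the behaviour depends on which implementation gets loaded, it can help to check that first. Here is a minimal sketch, assuming Text::CSV is installed; csv_backend_report is a hypothetical helper name, not part of the module:

```perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical helper: report which backend module Text::CSV loaded
# (Text::CSV_XS or Text::CSV_PP) along with the wrapper's version.
sub csv_backend_report {
    return sprintf "%s (Text::CSV %s)",
        Text::CSV->backend,
        Text::CSV->VERSION;
}

print csv_backend_report(), "\n";
```

If this reports Text::CSV_PP, installing Text::CSV_XS should make Text::CSV pick up the XS parser. As far as I know, the PERL_TEXT_CSV environment variable can also be used to choose the backend explicitly, but check the Text::CSV documentation for the accepted values.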

amon