I'm using Strawberry Perl 5.18.2.2 on Windows 8.1 64bit with Text::CSV
installed, and
I'm trying to parse the following CSV file from PayPal.
Date, Time, Time Zone, Name, Type, Status, Gross, Fee, Net, From Email Address, To Email Address, Transaction ID, Counterparty Status, Address Status, Item Title, Item ID, Shipping and Handling Amount, Insurance Amount, Sales Tax, Option 1 Name, Option 1 Value, Option 2 Name, Option 2 Value, Auction Site, Buyer ID, Item URL, Closing Date, Escrow Id, Invoice Id, Reference Txn ID, Invoice Number, Custom Number, Receipt ID, Balance, Address Line 1, Address Line 2/District/Neighborhood, Town/City, State/Province/Region/County/Territory/Prefecture/Republic, Zip/Postal Code, Country, Contact Phone Number,
"5/5/2014","17:44:45","PDT","Jack Payer","Payment Received","Completed","1.00","0.00","1.00","jack@yahoo.com","payer@gmail.com","8HT05934290026J","Verified","","","","","","","","","","","","","","","","","","","","","1.67","","","","","","","",
open, read and parse the file using:
open(READ, $sourcefile)
open($fh,"+>:encoding(utf8)","$base.m.csv")
undef $/;
$_ = <READ>;
# convert Unix line ending to dos
$_ =~ s/\r?\n|\r/\r\n/g;
print $fh $_;
close READ;
$/ = "\r\n";
seek($fh, 0, 0);
my $csv = Text::CSV->new({ allow_whitespace => 1, binary => 1 }); # should set binary attribute.
my $row=$csv->getline ($fh);
my @fields = @$row;
$csv->column_names (@fields);
$row = $csv->getline_hr($fh)
Here is the problem, starting with the Gross column it's reading it incorrectly which then messes up all the subsequent column. $row->{'Gross'}
reads 1.00,^@.00
instead of 1.00
.
The ^@
symbol is a special NULL
character that I see when I open the output file with gvim. This is where the issue starts.
Subsequently, $gross=$row->{'Fee'}
reads 1.00
instead of 0.00
, and $gross=$row->{'Net'}
reads jack@yahoo.com
instead of 1.00
Any idea why it's getting messed up?
EDIT: FULL CODE AS REQUESTED here: https://www.dropbox.com/s/p064pxitmw3jwmj/csv2qif.pl
EDIT: If i change
"1.00","0.00","1.00"
in the CSV file to
"1.00","2.00","3.00"
it works fine. Infact anything starting with "0.xx" it ends up parsing it as a NULL character. I don't get the NULL character. I'm lost, is something wrong with the CSV parser?