I've migrated to a new hosting provider running the same FreeBSD system, and one of my Perl scripts stopped working properly.
It downloads data from an external HTTPS site and stores it in a MySQL database. The data is in cp1251 encoding, and the same encoding is used for the MySQL database, tables, and connection. From my.cnf:
character-set-server=cp1251
collation-server=cp1251_general_ci
init-connect="SET NAMES cp1251"
When connecting to MySQL from the Perl script:
$dbh->do('SET CHARACTER SET cp1251');
So, I'm getting the data with:
$ua = new LWP::UserAgent;
....
$res = $ua->get(....)
$s = $res->decoded_content();
The script then parses $s and inserts the result into MySQL. When it does, the encoding is corrupted!
The funny thing I discovered is that if I just write this data to a text file, then read it back from the file and insert it into MySQL, it's not corrupted!
When I view this text file, I see that the data is in cp1251 encoding.
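The round trip I mean is roughly the following. This is a minimal sketch, not the actual script: the sample string and the temporary file are stand-ins, and the MySQL insert is left out.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in for the downloaded text: cp1251 bytes for a Russian word (hypothetical sample)
my $s = "\xF2\xE5\xF1\xF2";

# Write the string to a temporary file with no encoding layer
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} $s;
close $out or die "close failed: $!";

# Read the whole file back as plain bytes
open my $in, '<', $path or die "open failed: $!";
my $from_file = do { local $/; <$in> };
close $in;

# $from_file is the value that goes into MySQL without getting corrupted
print "same bytes\n" if $from_file eq $s;
```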
What changed since the previous hosting:
Perl: from 5.10.1 to 5.14.4
libwww: from 5.835 to 6.05
The MySQL server is the same, 5.1
UPDATE: Wow, just found something. If I replace $res->decoded_content() with $res->content(), everything works. Maybe that's because there's no charset in the headers of the page I'm downloading.
I still don't understand how decoded_content() changes the string in such a way that it looks like cp1251 but isn't. Some UTF-8 flags, maybe? Please help.
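To check the "UTF flags" guess, something like the sketch below could be used. It assumes decoded_content() fell back to ISO-8859-1 because the page declares no charset (that is libwww's documented default, but I haven't confirmed it is what happens here), and the sample bytes are made up.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# cp1251 bytes for a Russian word (hypothetical sample, not the real page)
my $bytes = "\xF2\xE5\xF1\xF2";

# Assumption: with no charset in the headers, decoded_content() decodes as ISO-8859-1
my $chars = decode('iso-8859-1', $bytes);

# The two strings compare equal, character for character...
my $same = $chars eq $bytes ? 1 : 0;

# ...but only the decoded one carries Perl's internal UTF-8 flag
my $flag_bytes = utf8::is_utf8($bytes) ? 1 : 0;
my $flag_chars = utf8::is_utf8($chars) ? 1 : 0;

printf "equal=%d flag(bytes)=%d flag(chars)=%d\n", $same, $flag_bytes, $flag_chars;

# Encoding back to ISO-8859-1 yields a plain byte string again
my $restored = encode('iso-8859-1', $chars);
my $flag_restored = utf8::is_utf8($restored) ? 1 : 0;
```

If the flags differ like this in the real script, that would explain why the string looks identical when printed to a file but is treated differently when it is handed to the database driver.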
UPDATE2: Here's the script (main parts):
#!/usr/bin/perl
use POSIX qw(strftime);
use LWP::UserAgent;
use HTTP::Headers;
use HTTP::Cookies;
use Digest::MD5 qw(md5_hex);
use DBI;
use common::sense;
no utf8;
no strict;
$ua = new LWP::UserAgent;
$hh = HTTP::Headers->new(
    User-Agent => 'Mozilla/5.0 (Windows NT 5.1; rv:21.0) Gecko/20100101 Firefox/21.0',
    Accept => 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    Accept-Language => 'en-us,en;q=0.7,ru;q=0.3',
    Accept-Encoding => 'gzip, deflate',
    Connection => 'keep-alive',
);
$ua->default_headers( $hh );
$ua->cookie_jar({});
$ua->timeout(20);
YMoney();
sub YMoney {
    $res = $ua->get('...');
    $res = $ua->post('...');
    ...
    $res = $ua->get("...");
    $s = $res->decoded_content();
    @list = reverse split("\n", $s);
    $dbh = DBI->connect("DBI:mysql:database=orders;host=localhost;port=3306", ....);
    $dbh->do('SET CHARACTER SET cp1251');
    for $line (@list) {
        next if ($line !~ /^\+;/);
        @pay{'data', 'amount', 'comment'} = map { s/"+//g; $_ } (split(';', $line))[1, 2, 5];
        $pay{hash} = md5_hex( join('', @pay{'data', 'amount', 'comment'}) );
        $id = $dbh->selectrow_array("SELECT id FROM ymoney WHERE hash = ?", {}, $pay{hash});
        if (!$id) {
            $dbh->do("INSERT INTO ymoney (operator, hash, data, amount, comment) VALUES ('yandex', ?, ?, ?, ?)", {},
                $pay{hash}, DB_Date($pay{data}), DB_Amount($pay{amount}), $pay{comment}
            );
        }
    }
}