I'm experiencing some weird, system-dependent issues with the Text::Unaccent module. Apologies if I'm missing something silly, but I've been banging my head against this one for hours with no real progress.
I have a simple script that reproduces the problem:
#!/usr/bin/perl
use utf8;
use strict;
use warnings;
use Text::Unaccent;
my $string = 'aaâaa';
my $unacd = unac_string("UTF-8", $string);
print "Accented: $string \n";
print "Unaccented: $unacd \n";
The output on my production server looks great:
[user@prod]$ perl test_unaccent.pl
Accented: aaâaa
Unaccented: aaaaa
The output on my development server looks strange:
[user@dev]$ perl test_unaccent.pl
Accented: aaâaa
Unaccented: UTF-8
Instead of the unaccented string, it just prints the charset name I pass as the first argument to unac_string.
I've checked the locale settings and tried to confirm that iconv is working properly (unac_string_utf16 seems to work), but I can't figure out what the problem could be.
The dev and prod servers definitely differ in a few key ways, but I can't see how those differences are relevant:
prod: CentOS 5, Perl 5.8.8
dev: CentOS 6, Perl 5.10.1
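In case it helps narrow things down, here is a minimal sketch of a script that could be run on both servers to compare the two Perl builds. It only uses the core Config module plus Text::Unaccent itself. The environment_report helper and the report keys are names I made up for illustration, and feeding unac_string explicit UTF-8 bytes (rather than a literal under "use utf8") is just an assumption about what might be worth ruling out, not a known cause.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Config;

# Gather the build details that most plausibly differ between two hosts.
# (Hypothetical helper; the key names are illustrative only.)
sub environment_report {
    my %report = (
        perl_version => sprintf('%vd', $^V),
        archname     => $Config{archname},
        ivsize       => $Config{ivsize},     # size of a Perl integer, in bytes
        ptrsize      => $Config{ptrsize},    # pointer size in bytes (4 or 8 on common builds)
    );

    # Load Text::Unaccent at run time so the report still prints without it.
    if (eval { require Text::Unaccent; 1 }) {
        my $version = Text::Unaccent->VERSION;
        $report{text_unaccent} = defined $version ? $version : 'unknown';

        # Explicit UTF-8 bytes for "aaâaa", independent of the source file encoding.
        my $bytes  = "aa\xC3\xA2aa";
        my $result = Text::Unaccent::unac_string('UTF-8', $bytes);

        # Hex-dump the result so terminal encoding cannot hide differences.
        $report{unac_result_hex} = defined $result ? unpack('H*', $result) : 'undef';
    }
    else {
        $report{text_unaccent} = 'not installed';
    }

    return %report;
}

my %report = environment_report();
printf "%-16s %s\n", $_, $report{$_} for sort keys %report;
```

Running this on prod and dev and diffing the two outputs should at least show whether the builds differ in architecture, integer or pointer size, or module version, and whether unac_string misbehaves even when given raw bytes.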
Thanks in advance for any suggestions/thoughts!