From what I understand, the Perl interpreter does not automatically decode command-line arguments passed from the terminal. This means that if they contain wide characters, one has to decode them manually, like this:
#!/usr/bin/env perl
use strict;
use warnings;
use I18N::Langinfo qw(langinfo CODESET);
use Encode qw(decode);
use Data::Dumper;
binmode STDOUT, ':utf8';
my $codeset = langinfo(CODESET); # returns 'UTF-8' in this locale
# Raw arguments: still undecoded bytes as received from the shell
print Dumper \@ARGV;
# Decode each argument from the locale's encoding into a character string
my @new_ARGV = map { decode($codeset, $_) } @ARGV;
# Now the strings inside @new_ARGV are in valid internal format
print Dumper \@new_ARGV;
The script above produces the following output (the terminal and locale are both set to UTF-8):
$ perl AC_351.pl śóą
$VAR1 = [
'ÅóÄ'
];
$VAR1 = [
"\x{15b}\x{f3}\x{105}"
];
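To show why the two dumps differ, here is a minimal sketch that compares the raw and decoded forms of one argument. It is not part of the script above: `$raw` is a hypothetical stand-in for one element of @ARGV, hard-coded as the UTF-8 bytes of śóą so that the example does not depend on the terminal or locale.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for one element of @ARGV: the UTF-8 bytes of "śóą".
my $raw = "\xC5\x9B\xC3\xB3\xC4\x85";

# Decoding turns the six bytes into three characters.
my $decoded = decode('UTF-8', $raw);

printf "raw length: %d\n",     length $raw;
printf "decoded length: %d\n", length $decoded;

# List the code points of the decoded string.
printf "code points: %s\n",
    join ' ', map { sprintf('U+%04X', ord($_)) } split //, $decoded;
```

Before decoding, Perl treats each byte as a separate character, which is why the first dump looks double-encoded once it passes through the `:utf8` output layer. After decoding, the string holds the three code points that appear in the second dump.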
Should I always decode command-line arguments like this, or is there a setting or switch that tells Perl to handle this by itself?