As far as I understand, the perl interpreter does not automatically decode command-line arguments passed from the terminal, so @ARGV holds raw bytes. This means that if the arguments contain non-ASCII characters, one has to decode them explicitly, like this:

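#!/usr/bin/env perl

use strict;
use warnings;

use I18N::Langinfo qw(langinfo CODESET);
use Encode qw(decode);
use Data::Dumper;

# Encode everything printed to STDOUT as UTF-8
binmode STDOUT, ':encoding(UTF-8)';

# Character encoding of the current locale ('UTF-8' on my system)
my $codeset = langinfo(CODESET);

# Raw arguments: still undecoded byte strings
print Dumper \@ARGV;

# Decode each argument from the locale's encoding into a character string
my @new_ARGV;
foreach my $arg (@ARGV) {
    push @new_ARGV, decode($codeset, $arg);
}

# Now the strings inside @new_ARGV are in Perl's internal character format
print Dumper \@new_ARGV;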

The script above produces the following output (both the terminal and the locale are set to UTF-8):

$ perl AC_351.pl śóą
$VAR1 = [
          'ÅóÄ'
        ];
$VAR1 = [
          "\x{15b}\x{f3}\x{105}"
        ];

Should I always decode command-line arguments like this, or is there some setting or switch that tells Perl to handle this by itself?
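For context, the closest thing I have found so far is the -C switch described in perlrun: its A flag (-CA, or PERL_UNICODE=A in the environment) makes perl treat the elements of @ARGV as UTF-8, and the active flags are visible in the ${^UNICODE} variable, where the value 32 corresponds to A. Below is a minimal sketch of how the two approaches might be combined, decoding manually only when -CA is not in effect. The helper name decode_args is hypothetical, and I am assuming that -CA always means UTF-8 regardless of the locale, so I am not sure this is the recommended way.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

use Encode qw(decode);
use I18N::Langinfo qw(langinfo CODESET);

# Hypothetical helper: decode a list of byte strings from the given
# encoding and return them as character strings.
sub decode_args {
    my ($codeset, @args) = @_;
    return map { decode($codeset, $_) } @args;
}

# ${^UNICODE} reflects the -C / PERL_UNICODE flags; 32 is the "A" flag,
# meaning perl has already treated @ARGV as UTF-8.
my @args = (${^UNICODE} & 32)
    ? @ARGV
    : decode_args(langinfo(CODESET), @ARGV);

binmode STDOUT, ':encoding(UTF-8)';

# Report the length in characters (not bytes) of every argument
printf "%d character(s): %s\n", length($_), $_ for @args;
```

Saved under a hypothetical name such as argv_demo.pl, this could be run either as `perl -CA argv_demo.pl śóą` or as `perl argv_demo.pl śóą`; in a UTF-8 locale both invocations should report the same character count. As far as I can tell, putting -C only on the #! line is not enough in recent perls, so the switch has to appear on the actual command line or come from PERL_UNICODE.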

Grzegorz Szpetkowski