3

I have this word comЯade but I can't print it in HTML because of the Russian Я... I tried:

$HTML::Entities::char2entity{'Я'} = '&#1071;';
$HTML::Entities::char2entity{'1071'} = '&#1071;';
$HTML::Entities::char2entity{'ï'} = '&#1071;';
$str = HTML::Entities::encode_entities( $str, q{Яï1071} );

and after that I tried:

$str =~ s/1071/&#1071;/g;
$str =~ s/Я/&#1071;/g;
$str =~ s/ï/&#1071;/g;

But in both cases I get this error:

Wide character in syswrite at /usr/local/share/perl/5.10.1/Starman/Server.pm line 470.

Why?

Some code:

title.mi

<%init>
binmode STDOUT, ':encoding(UTF-8)';
$str =~ s/&/%26/g; # this is working
$str =~ s/1071/&#1071;/g;
$str =~ s/Я/&#1071;/g;
$str =~ s/ï/&#1071;/g;
</%init>
<div class="bd-headline left">
<h1 style="margin-top:0; padding-top:0;"> <% $str %> </h1>
</div>

base.mc

<head>
     <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
mamesaye

3 Answers

2

Problem 1:

If your source code is encoded in UTF-8, you never told Perl so: add use utf8; to the file.

If your source code isn't encoded using UTF-8, it can't possibly have an "Я" in it.


Problem 2:

File handles can only transmit bytes, but you don't encode your Unicode characters into bytes. This is done by using a character encoding such as UTF-8. What encoding does your document specify it uses? Encode your output using it as follows:

binmode STDOUT, ':encoding(UTF-8)';
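
A minimal sketch of both points, assuming a standalone script rather than a Mason component (the variable names are illustrative, not taken from the question). It declares the source as UTF-8, then shows that the character string and its encoded form differ in length, which is why a handle that only carries bytes complains about a "wide character".

```perl
use strict;
use warnings;
use utf8;                      # Problem 1: declare that this source file is UTF-8
use Encode qw(encode);

my $str = 'comЯade';           # a character string; Я is U+042F, above 255

# Problem 2: a handle carries bytes, so the characters must be encoded.
my $bytes = encode('UTF-8', $str);

printf "%d characters, %d bytes\n", length($str), length($bytes);

# For STDOUT the same encoding can be done by an I/O layer on every print.
binmode STDOUT, ':encoding(UTF-8)';
print "$str\n";
```

The explicit encode() call is the form to reach for when the output does not go through STDOUT at all, for example when a server writes the response to a socket itself.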
ikegami
  • I tried adding <%init>binmode STDOUT, ':encoding(UTF-8)';</%init> but I still get the same error. What am I missing? – mamesaye Oct 11 '12 at 17:23
  • Not sure what that is, but that's not Perl you posted. You just repeated what you previously posted in the comment. – ikegami Oct 11 '12 at 18:31
  • I am using Mason 2 (Perl + HTML). I receive a string (the title) from the DB and print it. – mamesaye Oct 11 '12 at 18:38
1

Escaping characters by replacing them with html entities is almost never the right thing to do.

It's possible the underlying server (Catalyst?) is not Unicode-aware. Searching CPAN brings up Catalyst::Plugin::Unicode::Encoding, which may help.
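
As a rough illustration of what a server that only handles bytes needs: the error in the question is raised inside Starman, which is a PSGI server, and PSGI expects the response body as bytes. The sketch below is a bare PSGI application, not Mason or Catalyst code, and $app and the markup are made up for the example. It sends the real character encoded as UTF-8 instead of an entity.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical bare PSGI app; a framework would normally build this response.
my $app = sub {
    my ($env) = @_;

    my $html = "<h1>com\x{42F}ade</h1>";    # character string, \x{42F} is Я
    my $body = encode('UTF-8', $html);      # bytes, safe to hand to the server

    return [
        200,
        [
            'Content-Type'   => 'text/html; charset=utf-8',
            'Content-Length' => length $body,   # counted in bytes, not characters
        ],
        [$body],
    ];
};
```

With the charset declared in the header and the body encoded to match, the browser renders the Я without any entity replacement.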

evil otto
1

Some code:

title.mi

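<%init>
        use Encode;
        use HTML::Strip;
        # Encode the title to UTF-8 bytes before any further processing.
        my $hl = encode_utf8($str);
        # Replace the numeric entity with the literal character.
        my $find = "&#1071;";
        my $replace = "Я";
        $hl =~ s/$find/$replace/g;
        # Strip any HTML markup from the headline.
        my $hs = HTML::Strip->new();
        my $no_html_hl = $hs->parse($hl);
</%init>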
<div class="bd-headline left">
            <h1 style="margin-top:0; padding-top:0;"> <% $no_html_hl %> </h1>
</div>

base.mc

<head>    </head>  

this link was helpful.
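
The replacement above handles only &#1071;. A more general sketch, assuming the titles in the DB may contain other decimal character references as well, is shown here; decode_numeric_entities is a hypothetical helper written for this example, and HTML::Entities::decode_entities covers more cases (named and hexadecimal entities) if that module is available.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: turn decimal references such as &#1071; into characters.
sub decode_numeric_entities {
    my ($text) = @_;
    $text =~ s/&#(\d+);/chr($1)/ge;
    return $text;
}

# Decode first (character string), then encode once at the output boundary.
my $title = decode_numeric_entities('com&#1071;ade');
my $bytes = encode('UTF-8', $title);
```

Decoding before encoding keeps the string in character form while it is being processed, so the conversion to bytes happens exactly once.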

mamesaye