7

I have some Unicode code points (\u5315\u4e03\u58ec\u4e8c\u4e0a\u53b6\u4e4b) that I need to convert into the actual characters they represent.

What's the simplest way to do so?

brian d foy
Peterim
  • No, I'm looking for code to convert them. – Peterim Apr 19 '10 at 13:14
  • Convert them where? So far it sounds as if you want to load some file, parse those code points, replace them with the actual glyphs/characters, and then save the file. – BalusC Apr 19 '10 at 13:49
  • Sorry, but you're making things complicated :) I don't need to load or save any files; I need to convert Unicode code points into the actual characters, as I mentioned in my question. "Convert them where?" Obviously in the Perl script: convert a string of code points into a string of characters. Easy. – Peterim Apr 19 '10 at 14:36

4 Answers

7

Sometimes I'd just use pack:

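use strict;
use warnings;

# Encode output as UTF-8 so the wide characters print cleanly.
binmode STDOUT, ':utf8';

# Single-quoted, so each \\ is one literal backslash: \u5315\u4e03...
my $string = '\\u5315\\u4e03\\u58ec\\u4e8c\\u4e0a\\u53b6\\u4e4b';

# Replace each \uXXXX escape (exactly four hex digits) with its character.
$string =~ s/\\u([0-9a-fA-F]{4})/ pack 'U*', hex($1) /eg;

print $string;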
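
The substitution above can also be wrapped in a reusable subroutine. The following is only a sketch: the name `unescape_u` is hypothetical, it uses `chr` in place of `pack`, and it assumes that code points above U+FFFF, if any, arrive as UTF-16 surrogate pairs (two consecutive `\uXXXX` escapes), which the input in the question does not actually need.

```perl
use strict;
use warnings;

# Hypothetical helper: turn \uXXXX escapes in $text into characters.
sub unescape_u {
    my ($text) = @_;

    # Assumption: characters beyond U+FFFF are written as a surrogate pair
    # (high D800-DBFF followed by low DC00-DFFF). Join those first.
    $text =~ s{\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})}
              { chr(0x10000 + ((hex($1) - 0xD800) << 10) + (hex($2) - 0xDC00)) }eg;

    # Then convert every remaining four-digit escape.
    $text =~ s{\\u([0-9a-fA-F]{4})}{ chr hex $1 }eg;

    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';
print unescape_u('\u5315\u4e03\u58ec\u4e8c\u4e0a\u53b6\u4e4b'), "\n";
```

An unpaired surrogate escape falls through to the second substitution unchanged in spirit, so input that may contain one would need extra checking.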
brian d foy
4

Could Unicode::Escape be what you need?

brian d foy
izb
2
perl -C -E'say"\x{5315}\x{4e03}\x{58ec}\x{4e8c}\x{4e0a}\x{53b6}\x{4e4b}"'

or, for a more playful way:

perl -C -E'say map chr hex, qw(5315 4e03 58ec 4e8c 4e0a 53b6 4e4b)'
Hynek -Pichi- Vychodil
1
use JSON::XS;
binmode STDOUT, ':utf8';  # avoid "Wide character" warnings on output
print JSON::XS->new->decode('{"a":"\u5315\u4e03\u58ec\u4e8c\u4e0a\u53b6\u4e4b"}')->{a};
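
If JSON::XS is not installed, the same idea might work with JSON::PP, which has shipped with Perl since 5.14. This is a minimal sketch, not a tested drop-in: it assumes the escaped text contains no double quotes, bare backslashes, or control characters, since it is simply wrapped in quotes and parsed as a JSON string. The variable names are illustrative.

```perl
use strict;
use warnings;
use JSON::PP;

binmode STDOUT, ':encoding(UTF-8)';

# Single-quoted, so \u stays a literal backslash followed by "u".
my $escaped = '\u5315\u4e03\u58ec\u4e8c\u4e0a\u53b6\u4e4b';

# allow_nonref lets the decoder accept a bare JSON string rather than
# requiring an object or array at the top level.
my $decoded = JSON::PP->new->allow_nonref->decode(qq{"$escaped"});

print "$decoded\n";
```

Letting a JSON decoder do the work also takes care of surrogate pairs, at the cost of requiring input that is a valid JSON string body.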