I am trying to convert standard ASCII letters to their full-width Japanese equivalents. For example:

Game becomes Ｇａｍｅ

I searched for an answer and I found this question with a good answer that I've quoted below:

$str = "Ｇａｍｅ some other text by ヴィックサ";
$str = preg_replace_callback(
"/[\x{ff01}-\x{ff5e}]/u",
function($c) {
    // convert UTF-8 sequence to ordinal value
    $code = ((ord($c[0][0])&0xf)<<12)|((ord($c[0][1])&0x3f)<<6)|(ord($c[0][2])&0x3f);
    // full-width forms sit 0xFEE0 above their ASCII counterparts
    return chr($code - 0xfee0);
},
$str);

But I want it in the opposite direction. I tried changing the (-) sign to (+) in the return statement, but didn't have much success.

roullie
  • There's no reason to close this question; the asker has done *some* research, but isn't quite asking it correctly. –  Oct 17 '13 at 05:43
  • @LegoStormtroopr, that might be, but “I’ll just change a subtraction for an addition and see what happens” does not seem to hint that there’s any understanding of the underlying mechanisms … – CBroe Oct 17 '13 at 06:10

3 Answers

This is simple using PHP's mb_convert_kana function; see http://php.net/manual/en/function.mb-convert-kana.php. At a minimum you want the R mode, which converts "han-kaku" (half-width) alphabets to "zen-kaku" (full-width).
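
As a minimal sketch of how the modes differ (assuming the mbstring extension is loaded and the input is UTF-8; the sample string and variable names are only for illustration), "R" touches letters only, while "A" also converts digits:

```php
<?php
// Assumes ext-mbstring is available and the input string is UTF-8.
$str = "Game 123";

// "R": half-width letters -> full-width; digits are left alone.
$lettersOnly = mb_convert_kana($str, "R", "UTF-8");

// "A": half-width letters and digits -> full-width.
$lettersAndDigits = mb_convert_kana($str, "A", "UTF-8");

echo $lettersOnly, "\n", $lettersAndDigits, "\n";
```

Passing the encoding explicitly means the result does not depend on whatever mb_internal_encoding() happens to be set to.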

deceze

First, "/[\x{ff01}-\x{ff5e}]/u" matches characters that are already full-width. To go the other way you have to match the half-width (ASCII) characters instead, so I changed the pattern to "/[\x{0021}-\x{007e}]/u". The Unicode table for that range is here: http://jrgraphix.net/r/Unicode/0020-007F

The second problem is about encoding, I think. The original code converts the UTF-8 sequence to an ordinal value and passes it to chr(). But chr() returns a single byte, and a full-width letter cannot be stored in a single byte. So you have to build the character from its Unicode code point instead.
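
A small sketch of why simply flipping the sign does not work (the variable names are only for illustration): chr() keeps just the low 8 bits of its argument, so the shifted code point collapses to an unrelated one-byte character.

```php
<?php
// chr() keeps only the low 8 bits of its argument, so it always yields one byte.
$shifted = chr(ord("G") + 0xFEE0);   // 0x47 + 0xFEE0 = 0xFF27, truncated to 0x27

// The real full-width "Ｇ" (U+FF27) is three bytes in UTF-8.
$fullWidthG = "\xEF\xBC\xA7";

var_dump(strlen($shifted));     // int(1)
var_dump(strlen($fullWidthG));  // int(3)
var_dump($shifted === "'");     // bool(true): byte 0x27 is an apostrophe
```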

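I use ord() first to get the ASCII code of the character and add 65248 (0xFEE0, the distance from an ASCII character to its full-width form). Then I convert the result to hex, put it after "\u", and wrap it in double quotes so that json_decode() turns the escape sequence into the actual character.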

$str = "Game some other text by ヴィックサ";
$str = preg_replace_callback(
    "/[\x{0021}-\x{007e}]/u",
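    function($c) {
        // ASCII code + 65248 (0xFEE0) is the full-width code point; build a
        // JSON "\uXXXX" string and let json_decode() produce the character
        return json_decode('"' . '\\u' . dechex(ord($c[0]) + 65248) . '"');
    }, $str);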
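
If you would rather not go through json_decode(), the same offset can be applied and the code point encoded to UTF-8 by hand, which is the inverse of the bit shifting in the code quoted in the question. This is only a sketch: the function name toFullWidth is made up for illustration, and it assumes the input is UTF-8.

```php
<?php
// Hypothetical helper (not from the original answers); assumes UTF-8 input.
function toFullWidth(string $str): string
{
    return preg_replace_callback(
        '/[\x{0021}-\x{007e}]/u',          // printable ASCII except space
        function (array $m): string {
            $code = ord($m[0]) + 0xFEE0;   // U+0021..U+007E -> U+FF01..U+FF5E
            // Encode the code point as a three-byte UTF-8 sequence.
            return chr(0xE0 | ($code >> 12))
                 . chr(0x80 | (($code >> 6) & 0x3F))
                 . chr(0x80 | ($code & 0x3F));
        },
        $str
    ) ?? $str;                             // fall back to the input on a regex error
}

echo toFullWidth("Game"), "\n";
```

Characters outside the matched range, such as spaces or katakana, pass through unchanged.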

I couldn't use mb_convert_kana(). I don't know why, but I think it's because I was working with Korean strings, not Japanese.

I'm not good at English but I hope this explanation helps you.

체라치에
  • Welcome to SO! While your answer might be working code that solves the question, it is even more helpful if you add additional information to it that helps the OP understand what he did wrong and how your code works. – Johannes H. Nov 14 '18 at 13:43
  • Ahh, thank you! And sorry, I'll add some explanation. @Johannes H. – 체라치에 Nov 15 '18 at 06:07

There's an easier way to do it:

$str = "Game";
// Becomes "Ｇａｍｅ"
$wideStr = mb_convert_kana($str, "R");
amphetamachine