I hope this is not too specific a question... any thoughts are appreciated.

When someone fills out my contact form (UTF-8 encoded), the data correctly enters a MySQL database (UTF-8 encoded throughout) and a reply email is sent to the person who filled out the form (also UTF-8 encoded).

If the data is entered in English, all is good. If the data is entered in Japanese, the characters render correctly in the database, and the reply email (which takes their last and first names from the database, and is also completely written in Japanese) also renders correctly. All good, right?

On occasion, though, the reply email renders the characters as mojibake, even when sent to an address that usually renders kanji correctly.

I've been unable to replicate the error, but I know it has happened because my client sent me a screenshot of the reply email. Has anyone else run into this problem? I'm at a bit of a loss. I use Sendmail to send the emails.

Thanks

Coleen

1 Answer

Try detecting whether the text contains Japanese and, if so, sending it in the encoding Japanese mail clients typically expect (ISO-2022-JP). You would have to do the same for every other language that uses Chinese characters, and maybe even for Russian, etc. This stuff is a real pain.

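// Each helper returns true if $str contains at least one character from the block.

function isKanji($str) {
    // CJK Unified Ideographs (U+4E00 to U+9FFF)
    return preg_match('/[\x{4E00}-\x{9FFF}]/u', $str) > 0;
}

function isHiragana($str) {
    return preg_match('/[\x{3040}-\x{309F}]/u', $str) > 0;
}

function isKatakana($str) {
    return preg_match('/[\x{30A0}-\x{30FF}]/u', $str) > 0;
}

function isJapanese($str) {
    // These are plain functions, not methods, so call them directly (no $this).
    return isKanji($str) || isHiragana($str) || isKatakana($str);
}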

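$userinputtext = "日本語を認識したいです!";

// $subject and $body are the UTF-8 strings built elsewhere; $mail is the mailer
// object (the CharSet/Encoding properties match PHPMailer).
if (isJapanese($userinputtext)) {
    mb_language("ja");
    // Convert header and body to ISO-2022-JP (the -MS variant also covers vendor extension characters).
    $subject = mb_encode_mimeheader($subject, "ISO-2022-JP-MS");
    $body = mb_convert_encoding($body, "ISO-2022-JP-MS", "UTF-8");
    $mail->CharSet = 'ISO-2022-JP';
    $mail->Encoding = "7bit";
}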
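
For reference, the same idea can be sketched without a mailer object. This is only an illustration under a few assumptions: the input is UTF-8, the mbstring extension is loaded, and the helper names `containsJapanese` and `buildMailParts` are made up for this example. The returned values would map onto `$mail->CharSet`, `$mail->Encoding`, the subject, and the body. The fallback branch keeps UTF-8 and leaves the base64 encoding of the body to the mailer.

```php
<?php
// Assumption: all input strings are UTF-8.
mb_internal_encoding('UTF-8');

// Hypothetical helper: true if $str contains hiragana, katakana, or kanji.
function containsJapanese(string $str): bool {
    return preg_match('/[\x{3040}-\x{30FF}\x{4E00}-\x{9FFF}]/u', $str) === 1;
}

// Hypothetical helper: pick a charset and return the converted subject and body.
function buildMailParts(string $subject, string $body): array {
    if (containsJapanese($subject . $body)) {
        $charset = 'ISO-2022-JP';
        return [
            'charset'  => $charset,
            'encoding' => '7bit', // ISO-2022-JP uses only 7-bit bytes
            'subject'  => mb_encode_mimeheader($subject, $charset, 'B'),
            'body'     => mb_convert_encoding($body, $charset, 'UTF-8'),
        ];
    }
    // Fallback: stay in UTF-8; the body is left as-is for the mailer to base64-encode.
    return [
        'charset'  => 'UTF-8',
        'encoding' => 'base64',
        'subject'  => mb_encode_mimeheader($subject, 'UTF-8', 'B'),
        'body'     => $body,
    ];
}

$parts = buildMailParts('お問い合わせ', 'ありがとうございます');
```

Two caveats: the kanji range also matches Chinese text, so this check cannot tell the two languages apart, and any character that ISO-2022-JP cannot represent (emoji, for example) is replaced by a substitute character during conversion.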
pb2q
  • Thanks for taking the time to write this out, I'll definitely give it a go. Regarding the CharSet though, isn't it best practice to keep it consistent throughout my database and scripting? I would ideally like to have UTF-8 render everything - could I change the ISO-2022-JP to UTF-8? And if so, would I also need to change $mail->Encoding to something different? Again, thanks for your time. You're right, this mojibake problem is a bit of a nightmare that I've been working with (against?) for a couple of years. – Coleen Jan 26 '14 at 05:39