9

I can echo Japanese characters fine, but when I use substr and echo part of the string, it just turns into question marks (�).

Note: I set my header to UTF-8:

header('Content-Type: text/html; charset=utf-8');

and added the meta tag <meta http-equiv="Content-type" content="text/html; charset=utf-8" />

$word = "せんせい";
echo $word;       //works just fine

echo substr($word,-1);    //now it just echoes �

//this one also failed
echo $word[0];    //echoes �
  • 1
    Please understand what the header does: it only claims that the content you send is encoded in `UTF-8`. It is not, because `substr` cuts the string at byte level and leaves invalid `UTF-8`. – Esailija Jul 31 '12 at 10:19
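
To make the comment above concrete, here is a small sketch that inspects the bytes `substr` returns. It assumes the source file is saved as UTF-8 and that the mbstring extension is available (used only for the validity check); the variable names are just for illustration.

```php
<?php
// Assumption: this file is saved as UTF-8 and mbstring is enabled.
$word = "せんせい";

// Each of these kana takes 3 bytes in UTF-8, so the string is 12 bytes long.
$byteLength = strlen($word);

// substr() and $word[0] both work on single bytes, not characters.
$lastByte  = substr($word, -1);
$firstByte = $word[0];

// A lone byte cut out of a 3-byte sequence is not valid UTF-8,
// which is why the browser shows a replacement character.
$lastIsValid = mb_check_encoding($lastByte, 'UTF-8');

echo $byteLength, "\n";           // 12
echo bin2hex($lastByte), "\n";    // 84
echo bin2hex($firstByte), "\n";   // e3
var_dump($lastIsValid);           // bool(false)
```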

3 Answers

10

When working with your multibyte strings, you'll need to use the multibyte string functions, in this case mb_substr.
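
A minimal sketch of what that looks like for the string in the question, assuming the mbstring extension is enabled and the file is saved as UTF-8. The encoding is passed explicitly here so the result does not depend on server configuration.

```php
<?php
// Assumption: mbstring is enabled and this file is saved as UTF-8.
$word = "せんせい";

// mb_substr() counts characters in the given encoding, not bytes.
$lastChar  = mb_substr($word, -1, 1, 'UTF-8');  // replaces substr($word, -1)
$firstChar = mb_substr($word, 0, 1, 'UTF-8');   // replaces $word[0]

// mb_strlen() likewise reports characters rather than bytes.
$charCount = mb_strlen($word, 'UTF-8');

echo $lastChar, "\n";   // い
echo $firstChar, "\n";  // せ
echo $charCount, "\n";  // 4
```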

Michael Robinson
3

Try the multibyte substring function, mb_substr(). It is designed for strings that contain characters outside the ASCII character set.

Branden S. Smith
2
mb_substr will work. But remember to add the following line at the top of your script:

mb_internal_encoding("UTF-8"); // sets the internal character encoding to UTF-8, which mb_substr then uses by default
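
As a rough illustration of the two options, the sketch below sets the internal encoding once and also shows the alternative of passing the encoding on each call. It assumes mbstring is enabled and the file is saved as UTF-8; whether the explicit call to mb_internal_encoding is actually required depends on the PHP version and configuration.

```php
<?php
// Assumption: mbstring is enabled and this file is saved as UTF-8.
mb_internal_encoding('UTF-8');  // default encoding for later mb_* calls

$word = "せんせい";

// Relies on the internal encoding set above.
$viaDefault = mb_substr($word, -1);

// Passes the encoding per call, independent of the global setting.
$viaExplicit = mb_substr($word, -1, 1, 'UTF-8');

echo $viaDefault, "\n";   // い
echo $viaExplicit, "\n";  // い
```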
dev4life