9

I have this code:

$string = 'علی';
echo strlen($string);

Since $string contains 3 Persian characters, I expect the output to be 3, but I get 6.

Why is my output 6?

How can I get the real character count of a Persian string in PHP?

Peyman Mohamadpour
  • 17,954
  • 24
  • 89
  • 100
user3932710
  • 125
  • 5
  • 3
    Use [`mb_strlen()`](http://php.net/mb-strlen) (from MBString extension). – BlitZ Sep 01 '14 at 06:31
  • 2
    Your output is 6 because `strlen()` counts bytes without regard to encoding. In your encoding (probably UTF-8), each of these characters takes 2 bytes, so 3 characters give 6 (`3 chars * 2 bytes`). – BlitZ Sep 01 '14 at 06:35
  • 1
    I ran `var_dump(mb_strlen('علی'));` myself, but the output is still 6 ? – bhargavg Sep 01 '14 at 06:36

5 Answers

19

Use mb_strlen

Returns the number of characters in the string str, interpreted in the character encoding given by the second parameter, encoding. A multi-byte character is counted as 1.

Since your 3 characters are all multi-byte, you get 6 returned with strlen, but this returns 3 as expected.

echo mb_strlen($string,'utf-8');

Fiddle

Note

Don't underestimate this function and its alternatives. You might be tempted to say: if the characters are multi-byte, just take the length from strlen and divide it by 2. That only works if every character in the string takes exactly 2 bytes; even a single period (.) throws the count off. For example, this

echo mb_strlen('علی.','utf-8');

returns 4, which is correct. So this function does not simply divide the byte length by 2; it counts 1 for every character, whether multi-byte or single-byte.
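
To make the point above concrete, here is a minimal sketch comparing byte counts and character counts. It assumes the file is saved as UTF-8 and that the mbstring extension is loaded. The helper name `byte_and_char_counts` and the sample string `'a€'` are hypothetical, added only for illustration.

```php
<?php
// Hypothetical helper: returns [byte count, character count] for a UTF-8 string.
function byte_and_char_counts(string $s): array
{
    // strlen() counts bytes; mb_strlen() counts characters in the given encoding.
    return [strlen($s), mb_strlen($s, 'UTF-8')];
}

$samples = [
    'علی',   // 3 Persian letters, 2 bytes each in UTF-8
    'علی.',  // the same letters plus a 1-byte ASCII period
    'a€',    // illustrative sample: 1-byte 'a' plus the 3-byte euro sign
];

foreach ($samples as $s) {
    [$bytes, $chars] = byte_and_char_counts($s);
    printf("%d bytes, %d characters\n", $bytes, $chars);
}
```

Only the first sample happens to have a byte count that is exactly twice its character count, which is why halving the result of strlen is not a safe shortcut.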

Note 2

It looks like you decided not to use this method, perhaps because the mbstring extension is not enabled by default in old PHP versions and you chose not to enable it :) For future readers, though: enabling it is not difficult, and it is advisable if you deal with multi-byte characters, since length is unlikely to be the only thing you need to handle. See the Manual.

Hanky Panky
  • 46,730
  • 8
  • 72
  • 95
9

Try this:

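function ustrlen($text)
{
    // Prefer mbstring when the extension is loaded.
    if (function_exists('mb_strlen')) {
        return mb_strlen($text, 'utf-8');
    }
    // Fallback: '//u' splits between UTF-8 characters and also yields an
    // empty string at each end, hence the "- 2".
    return count(preg_split('//u', $text)) - 2;
}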

It works on any PHP version, with or without the mbstring extension.
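
For readers wondering where the "- 2" in the fallback comes from, the sketch below shows what preg_split returns for the empty pattern. It assumes the file is saved as UTF-8. The helper name `split_chars` is hypothetical, and using the PREG_SPLIT_NO_EMPTY flag is an alternative to the code above, not part of it.

```php
<?php
// Hypothetical helper: splits a UTF-8 string with the empty pattern.
function split_chars(string $text): array
{
    // With the /u modifier the empty pattern matches between every UTF-8
    // character, and also at the very start and very end of the string.
    $parts = preg_split('//u', $text);
    // preg_split() returns false on failure, e.g. when $text is not valid UTF-8.
    return $parts === false ? [] : $parts;
}

$parts = split_chars('علی');
// $parts holds a leading '', the three letters, and a trailing ''.
echo count($parts) - 2, "\n";

// PREG_SPLIT_NO_EMPTY drops the two empty pieces, so no subtraction is needed.
echo count(preg_split('//u', 'علی', -1, PREG_SPLIT_NO_EMPTY)), "\n";
```

Either form gives the same count for valid UTF-8 input. Neither handles invalid input gracefully unless the false return value is checked, as the helper does.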

Peyman Mohamadpour
  • 17,954
  • 24
  • 89
  • 100
Power Man
  • 166
  • 1
  • 10
5

The mb_strlen function is your friend.

j123b567
  • 3,110
  • 1
  • 23
  • 32
4
$string = 'علی';
echo mb_strlen($string, 'utf8');
Peyman Mohamadpour
  • 17,954
  • 24
  • 89
  • 100
dashtinejad
  • 6,193
  • 4
  • 28
  • 44
0

As of PHP 5, iconv_strlen() can be used. According to php.net, it returns the character count of a string, so it is probably the best choice:

iconv_strlen("علی");
// 3

Based on this answer by chernyshevsky@hotmail.com, you can also try the following (note that utf8_decode() is deprecated as of PHP 8.2):

function string_length (string $string) : int {
    return strlen(utf8_decode($string));
}

string_length("علی");
// 3

Also, as others answered, you can use mb_strlen():

mb_strlen("علی");
// 3

Notes

  • They differ only slightly, in how they handle invalid byte sequences:

    iconv_strlen("a\xCC\r"); // A notice
    string_length("a\xCC\r"); // 3
    mb_strlen("a\xCC\r"); // 2
    
  • Performance: mb_strlen() is the fastest. There is no overall performance difference between iconv_strlen() and string_length(), but, surprisingly, mb_strlen() is about 9 times faster than both (in my tests)!
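
Since the three functions disagree on malformed input, one option is to validate the string before counting. The following is a minimal sketch, assuming the mbstring extension is available. The helper name `safe_strlen` is hypothetical, and returning null for invalid input is only one possible design choice.

```php
<?php
// Hypothetical helper: returns the character count of a UTF-8 string,
// or null when the input is not valid UTF-8.
function safe_strlen(string $text): ?int
{
    if (!mb_check_encoding($text, 'UTF-8')) {
        return null;
    }
    return mb_strlen($text, 'UTF-8');
}

var_dump(safe_strlen('علی'));      // int(3)
var_dump(safe_strlen("a\xCC\r"));  // NULL, because \xCC is not followed by a continuation byte
```

This way the caller decides explicitly what to do with invalid input instead of relying on each function's own behavior.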

MAChitgarha
  • 3,728
  • 2
  • 33
  • 40