
I'm using preg_split to split the following string:

$string = 'textarea name="custom_field" label="Space space space" column="1/2"';
$preg_split = preg_split("/\s(?![\w\s]+\")/", $string);
echo '<pre>',print_r($preg_split,1),'</pre>';

This code gives the following result:

Array
(
    [0] => textarea
    [1] => name="custom_field"
    [2] => label="Space space space"
    [3] => column="1/2"
)

Everything is working fine here.

However, if the quoted value contains Turkish characters separated by a space, it no longer works as desired:

$string = 'textarea name="custom_field" label="âçğı İîöşüû" column="1/2"';
$preg_split = preg_split("/\s(?![\w\s]+\")/", $string);
echo '<pre>',print_r($preg_split,1),'</pre>';

It splits in the middle of the quoted value that contains the Turkish characters:

Array
(
    [0] => textarea
    [1] => name="custom_field"
    [2] => label="âçğı
    [3] => İîöşüû"
    [4] => column="1/2"
)

How can I make preg_split handle the Turkish characters so that the quoted value stays in a single array element? Like so:

Array
(
    [0] => textarea
    [1] => name="custom_field"
    [2] => label="âçğı İîöşüû"
    [3] => column="1/2"
)
The Bobster

1 Answer


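Just use the 'u' modifier, which makes PCRE treat the pattern and the subject as UTF-8. Without it the string is matched byte by byte, so \w does not match the multi-byte Turkish letters, the negative lookahead no longer sees an unbroken run of word characters up to the closing quote, and the space inside the quotes becomes a split point. For example: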

$string = 'textarea name="custom_field" label="âçğı İîöşüû" column="1/2"';
$preg_split = preg_split("/\s(?![\w\s]+\")/u", $string);
echo '<pre>',print_r($preg_split,1),'</pre>';
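
The same fix can be wrapped in a small function with an error check. This is only a sketch: the name split_attributes is made up for illustration, and the pattern is the one from the question with 'u' added. The check is there because, with 'u', preg_split returns false when the subject is not valid UTF-8. The two var_dump calls at the end let you compare how \w treats a single non-ASCII letter with and without the modifier.

```php
<?php
// Hypothetical helper name, not part of the original question or answer.
function split_attributes(string $input): array
{
    // 'u' switches PCRE to UTF-8 mode, so \w can match letters
    // such as "ğ" or "İ" instead of seeing their individual bytes.
    $parts = preg_split('/\s(?![\w\s]+")/u', $input);

    // With 'u', a subject that is not valid UTF-8 makes preg_split fail.
    if ($parts === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8: ' . preg_last_error());
    }

    return $parts;
}

$string = 'textarea name="custom_field" label="âçğı İîöşüû" column="1/2"';
print_r(split_attributes($string));

// Compare byte mode and UTF-8 mode on a single non-ASCII letter.
var_dump(preg_match('/^\w+$/', 'ğ'));
var_dump(preg_match('/^\w+$/u', 'ğ'));
```

Note that the pattern still only protects spaces that are followed by word characters and whitespace up to the closing quote. A quoted value with a space followed by punctuation, such as label="a b-c", would still be split.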
pinaki