$input = "žąsis su šešiolika žąsyčių";
preg_match_all("/\b(žąs\S*)/iu", $input, $output_array);
print_r($output_array);

returns one big nothing. I want it to return both "žąsis" and "žąsyčių". It seems like a simple problem, but I can't find a simple answer. Should I encode both the subject and the pattern somehow?

By "one big nothing" I mean an empty multidimensional array:

Array ( [0] => Array ( ) [1] => Array ( ) )
August

1 Answer

Try adding the (*UTF8) sequence to the beginning of the pattern (note that the leading \b is dropped here):

$input = "žąsis su šešiolika žąsyčių";
preg_match_all("/(*UTF8)(žąs\S*)/iu", $input, $output_array);
print_r($output_array);

EDIT:

I tested this on PHP 5.2.17 and 5.3.20. I don't have any problems on 5.3.20, but I do get the same empty output on 5.2.17. I couldn't find any documentation that explains why this happens, but the problem seems to go away when the leading \b (word boundary) is removed. Here's a screenshot with the output, PHP version, loaded extensions, and source code (if this doesn't help, make sure you're saving your documents as UTF-8 rather than in whatever encoding Windows defaults to):

[Screenshot: script output, PHP version, loaded extensions, and source code]
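If you still want a word boundary on the older version, here is a minimal sketch of one possible workaround. My assumption, which I have not confirmed against the 5.2 source, is that \b on that build only treats ASCII letters, digits, and underscore as word characters, so it never sees a boundary in front of "ž". The lookbehind below spells the boundary out with Unicode properties instead; it relies on PCRE being built with Unicode property support. The empty-pattern check at the top is just a quick way to confirm the subject really is valid UTF-8.

```php
<?php
$input = "žąsis su šešiolika žąsyčių";

// With the u modifier, preg_match() fails on a subject that is not valid
// UTF-8, so an empty pattern doubles as an encoding check.
if (preg_match('//u', $input) !== 1) {
    die("Input is not valid UTF-8\n");
}

// (?<![\p{L}\p{N}_]) stands in for the leading \b: the match may not be
// preceded by a Unicode letter, a Unicode digit, or an underscore.
preg_match_all('/(?<![\p{L}\p{N}_])(žąs\S*)/iu', $input, $output_array);

// Group 1 holds the matched words.
print_r($output_array[1]);
```

On a setup where the original pattern already works, this should give the same two words, "žąsis" and "žąsyčių".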

jerdiggity