Here is a simpler pattern used by preg_split(), followed by preg_replace() to fix up the left and right double quotes (Demo):
$in = '“Chess helps us overcome difficulties and sufferings,” said Unnikrishnan, taking my queen. “On a chess board you are fighting. as we are also fighting the hardships in our daily life.” he said.';
$out = preg_split('/ (?=“)/', $in, 0, PREG_SPLIT_NO_EMPTY);
//$out = preg_match_all('/“.+?(?= “|$)/', $in, $out) ? $out[0] : null;
$find = '/[“”]/u'; // unicode flag is essential
$replace = '"';
$out = preg_replace($find, $replace, $out); // replace curly quotes with standard double quotes
var_export($out);
Output:
array (
0 => '"Chess helps us overcome difficulties and sufferings," said Unnikrishnan, taking my queen.',
1 => '"On a chess board you are fighting. as we are also fighting the hardships in our daily life." he said.',
)
preg_split() matches the space followed by a “ (LEFT DOUBLE QUOTE).
The preg_replace() step requires a pattern with the u modifier to make sure the left and right double quotes in the character class are identified as whole multibyte characters rather than as individual bytes. Using '/“|”/' means you can remove the u modifier, but it roughly doubles the steps that the regex engine has to perform (for this case, my character class uses just 189 steps versus the piped characters using 372 steps).
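To see why the u modifier matters for the character class, here is a minimal sketch that compares the three pattern variants on a short made-up string. It assumes the script file is saved as UTF-8, so each curly quote is a 3-byte sequence; the $sample value and variable names are only for illustration.

```php
<?php
// Hypothetical sample: one letter wrapped in curly quotes (UTF-8 source assumed).
$sample = '“a”';

// Character class with the u modifier: each curly quote is one character.
$withU = preg_replace('/[“”]/u', '"', $sample);

// Character class without u: the class is built from the individual bytes of
// the two quotes, so every byte of each 3-byte quote is replaced on its own.
$withoutU = preg_replace('/[“”]/', '"', $sample);

// Alternation without u: each branch is a literal byte sequence,
// so it still matches whole quotes.
$piped = preg_replace('/“|”/', '"', $sample);

var_export([$withU, $withoutU, $piped]);
```

Without the u modifier, the character-class version should turn each curly quote into three straight quotes, which is the corruption the modifier prevents.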
Furthermore, regarding the choice between preg_split() and preg_match_all(), the reason to go with preg_split() is that the objective is merely to split the string on the space that is followed by a left double quote. preg_match_all() would be a more practical choice if the objective were to omit substrings not neighboring the delimiting space character.
Despite my logic, if you still want to use preg_match_all(), my preg_split() line can be replaced with:
$out = preg_match_all('/“.+?(?= “|$)/', $in, $out) ? $out[0] : null;
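As a rough illustration of how the two approaches can differ, the sketch below runs both patterns on a made-up input that starts with a sentence outside any quotation. The input string and the variable names $split, $matched, and $m are assumptions for this example, not part of the original question.

```php
<?php
// Hypothetical input: the leading sentence is not part of any quotation.
$in = 'He moved a pawn. “Chess helps us,” said Unnikrishnan. “On a chess board you are fighting,” he said.';

// Splitting keeps every piece, including the text before the first quote.
$split = preg_split('/ (?=“)/', $in, 0, PREG_SPLIT_NO_EMPTY);

// Matching only collects substrings that begin with a left double quote.
$matched = preg_match_all('/“.+?(?= “|$)/', $in, $m) ? $m[0] : [];

var_export($split);
var_export($matched);
```

With this input, preg_split() should return three elements (the leading sentence plus the two quoted segments), while preg_match_all() should return only the two segments that start with “. On the original input, where the string begins with a quote, both approaches give the same result.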