I came across the following to split a string into "tokens":

$tokens = preg_split("/[^\-_A-Za-z0-9]+/", $string);

Could somebody explain to me how this is different from this:

$tokens = explode(' ', $string);

Any help would be greatly appreciated :-)

ekhumoro
Pr0no

2 Answers

The regular expression you provided:

$tokens = preg_split("/[^\-_A-Za-z0-9]+/", $string);

will split the input string wherever it finds a run of one or more characters that are not a dash (-), underscore (_), ASCII letter (lowercase or uppercase), or digit. Each such run is treated as a single delimiter and is discarded.

Whereas:

$tokens = explode(' ', $string);

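will only split the string on a single literal space character. Tabs, newlines, and punctuation stay inside the tokens, and two consecutive spaces produce an empty token between them.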
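
As a rough illustration of the difference, here is a minimal sketch that runs both calls on the same input. The sample string is made up for this example and is not from the question.

```php
<?php
// Hypothetical sample input, chosen only for illustration.
$string = "foo-bar, baz_1  qux";

// Splits on every run of characters outside [-_A-Za-z0-9],
// so ", " and the double space each count as one delimiter.
$byRegex = preg_split("/[^\-_A-Za-z0-9]+/", $string);
// ["foo-bar", "baz_1", "qux"]

// Splits on each single space only; the comma stays attached
// and the double space yields an empty string.
$bySpace = explode(' ', $string);
// ["foo-bar,", "baz_1", "", "qux"]
```

Note that the dash and underscore survive inside the regex tokens because the character class explicitly excludes them from the delimiter set.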

Ryan Berger

The literal reading of [^\-_A-Za-z0-9]+ is:

Match one or more individual characters, each of which is not -, _, a letter A to Z (uppercase or lowercase), or a digit.

preg_split will split the input on every match of the above, but explode will only split on a literal space character. Many other characters, such as tabs, newlines, and punctuation, also match the regular expression, so preg_split will split on them but explode won't, and the resulting arrays can differ.
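
To make the "other characters" point concrete, here is a small sketch with tabs, newlines, and punctuation in the input. The input strings are invented for illustration. The last call uses the PREG_SPLIT_NO_EMPTY flag, which is not part of the original snippet; it is shown only as one way to drop the empty strings that appear when the input starts or ends with a delimiter.

```php
<?php
// Hypothetical input: a tab, a newline, and "! " separate the words.
$string = "alpha\tbeta\ngamma! delta";

// Tab, newline, and "! " all fall outside [-_A-Za-z0-9], so each is a delimiter.
$byRegex = preg_split("/[^\-_A-Za-z0-9]+/", $string);
// ["alpha", "beta", "gamma", "delta"]

// Only the one literal space is a delimiter here.
$bySpace = explode(' ', $string);
// ["alpha\tbeta\ngamma!", "delta"]

// Leading or trailing delimiters give empty strings at the edges;
// PREG_SPLIT_NO_EMPTY (an addition for this sketch) filters them out.
$noEmpty = preg_split("/[^\-_A-Za-z0-9]+/", "  hello, world!", -1, PREG_SPLIT_NO_EMPTY);
// ["hello", "world"]
```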

Factor Mystic