0

I have a name pattern looking like this:

F. O. O. Bar
F. Oobar
F. O. Obar

I'm trying to write a regex that splits names like these into first name (possibly just initials) and surname.

foreach($authors as $author) {
    $arr = preg_split("/([a-zA-Z]. )+/", $author, -1, PREG_SPLIT_DELIM_CAPTURE);
    //Do stuff with $arr
}

However, this also splits Foo. Bar (at the o., to be exact). I cannot restrict the match by letter case either, because the incoming data is VERY inconsistent, so I cannot rely on capitalization.

memowe
sonOfRa
  • 1
    I just noticed you're using `preg_split` but you want to match? What *do* you really want? – Tim Pietzcker Sep 18 '12 at 10:38
  • 2
    @TimPietzcker match was the wrong choice of words, I do want to split actually. I have a list of names, and I need to split them in order to store them in a database that needs separate entries for first name (initials) and surname. – sonOfRa Sep 18 '12 at 10:47

2 Answers

3

The . has to be escaped.

$arr = preg_split("/([a-zA-Z]\. )+/", $author, -1, PREG_SPLIT_DELIM_CAPTURE);
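
To make the effect of the escape concrete, here is a small sketch comparing the unescaped and escaped patterns. The sample strings are only illustrations, not data from the question, and the patterns are single-quoted so PHP passes the backslash through unchanged. Each call passes -1 as the limit so that PREG_SPLIT_DELIM_CAPTURE lands in the flags argument.

```php
<?php
// Unescaped dot: "." matches any character, so "oo " in "Foo Bar"
// is treated as a delimiter even though it contains no dot at all.
$unescaped = preg_split('/([a-zA-Z]. )+/', 'Foo Bar', -1, PREG_SPLIT_DELIM_CAPTURE);

// Escaped dot: only a letter followed by a literal ". " is a delimiter,
// so "Foo Bar" is left in one piece.
$escaped = preg_split('/([a-zA-Z]\. )+/', 'Foo Bar', -1, PREG_SPLIT_DELIM_CAPTURE);

// A real initial is still split off (with an empty leading piece).
$initial = preg_split('/([a-zA-Z]\. )+/', 'F. Oobar', -1, PREG_SPLIT_DELIM_CAPTURE);

// Escaping alone does not stop the split inside "Foo.".
$stillSplit = preg_split('/([a-zA-Z]\. )+/', 'Foo. Bar', -1, PREG_SPLIT_DELIM_CAPTURE);

var_dump($unescaped, $escaped, $initial, $stillSplit);
```

The last call shows that escaping fixes the "any character" problem but still splits at the o. in Foo. Bar, which is the case the question asks about.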
wroniasty
1

You mean you only want to allow one letter before the dot? Use a word boundary to ensure this:

$arr = preg_split("/\b([a-zA-Z]\. )+/", $author, -1, PREG_SPLIT_DELIM_CAPTURE);

Also, as wroniasty correctly noted, the dot needs escaping.
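
As a rough sketch of how the word boundary behaves on the name shapes from the question, the snippet below runs a slightly modified pattern over a few sample names. Two things here are my own assumptions rather than part of the answer: the repetition is moved into a non-capturing group so the capture holds the whole run of initials (a repeated capturing group keeps only its last iteration), and PREG_SPLIT_NO_EMPTY plus trim are used to tidy the pieces. The $authors list is hypothetical test data.

```php
<?php
// Hypothetical sample data, modelled on the name shapes in the question.
$authors = ['F. O. O. Bar', 'F. Oobar', 'F. O. Obar', 'Foo. Bar'];

// \b requires the letter to start a word, so the "o." ending "Foo." is skipped.
// (?: ... )+ repeats without capturing; the outer ( ... ) captures all initials.
$pattern = '/\b((?:[a-zA-Z]\. )+)/';
$flags = PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY;

$result = [];
foreach ($authors as $author) {
    // Split, then strip the trailing space left on the initials.
    $result[$author] = array_map('trim', preg_split($pattern, $author, -1, $flags));
}

print_r($result);
```

With these inputs the initials come back as one element followed by the surname, while Foo. Bar is returned as a single unsplit element, so the caller still has to decide how to store names that contain no initials.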

Tim Pietzcker