
I have tried several ways to solve this and have found a poor workaround, but I would like to know if there is something better. I have a string made up of several substrings separated by commas. I can split it into an array using preg_split or explode, but some of the substrings themselves contain commas, and I do not want those split into separate array members. My workaround is to put a full stop at the end of every substring and then tell explode to split only on ".,". Example string:

$string = "Henry the horse, Billy the donkey, Harry the mule, George, the hippo";

Workaround:

$string = "Henry the horse., Billy the donkey., Harry the mule., George, the hippo.";
$list = explode('.,',$string);

I can't for the life of me think of a way to tell the program that the comma after George is not the end of the substring. A related issue: I would like to split the string at the commas but keep the commas in the array members.

==> Henry the horse,
==> Billy the donkey,
==> Harry the mule,
==> George, the hippo,

My idea is simply to add them back afterwards. Is there a simpler way? In other words, is there a way of splitting at a delimiter while keeping the delimiter in the array members?

Jan
charco
  • If you don't know the logic on how to decide whether a comma is part of the substring or a delimiter, then it will be impossible to tell PHP to do it. So what is the logic? – trincot Nov 25 '18 at 09:58
  • Any feed-back on my comment, or on the answers below? – trincot Nov 25 '18 at 19:30

3 Answers

1

I'm guessing that each substring must start with a capital letter. If so, this would do it:

$string = "Henry the horse, Billy the donkey, Harry the mule, George, the hippo";

preg_match_all("~[A-Z].*?(?:$|,)(?!\s*[a-z])~", $string, $result);

$result[0] will contain the following output:

[
    "Henry the horse,",
    "Billy the donkey,",
    "Harry the mule,",
    "George, the hippo"
]
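For readers new to regular expressions, here is the same pattern spelled out with the x (extended) flag, which lets each part carry a comment. This is only an annotated sketch of the pattern above; the variable names $pattern and $members are illustrative, and it still relies on the assumption that every substring starts with a capital letter.

```php
<?php
$string = "Henry the horse, Billy the donkey, Harry the mule, George, the hippo";

// Same pattern as above, split over several lines with the x flag.
$pattern = '~
    [A-Z]          # a member starts at a capital letter
    .*?            # then as few characters as possible
    (?:$|,)        # up to the end of the string or a comma
    (?!\s*[a-z])   # but not a comma that is followed by a lowercase word
~x';

// preg_match_all fills $result; index 0 holds the full matches.
preg_match_all($pattern, $string, $result);
$members = $result[0];

print_r($members);
```

Note that the last member has no trailing comma, because the input string does not end with one.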
trincot
  • Yes, the string originates from an array in which each member starts with a capital letter, so this should do the trick. I didn't know that preg_match_all returned an array! I am going to have to get my head around the regex (tutorial time). Thanks. – charco Nov 26 '18 at 15:49
0

You could use lookarounds or (*SKIP)(*FAIL). With preg_split(), use either

,(?! the)

or

, the(*SKIP)(*FAIL)|,

Tapping on my mobile
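A minimal sketch of how those two patterns might be used with preg_split(). The trailing \s* is an addition not in the answer, included so the leading space is dropped from each member, and the variable names are illustrative. Both patterns hard-code " the", so they are tailored to this sample string rather than being a general rule.

```php
<?php
$string = "Henry the horse, Billy the donkey, Harry the mule, George, the hippo";

// Negative lookahead: split on a comma unless " the" comes right after it.
// The \s* (an assumption added here) swallows the space after the comma.
$viaLookahead = preg_split('/,(?! the)\s*/', $string);

// (*SKIP)(*FAIL): match ", the" first and throw that match away,
// so only the remaining commas act as delimiters.
$viaSkipFail = preg_split('/, the(*SKIP)(*FAIL)|,\s*/', $string);

print_r($viaLookahead);
print_r($viaSkipFail);
```

Both calls should give the same four members, with "George, the hippo" kept intact.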

Jan
0

preg_split supports the flag PREG_SPLIT_DELIM_CAPTURE. See the documentation.

The delimiter needs to be in parentheses:

php > var_dump(preg_split('/(, )/', 'Henry the horse, Billy the donkey, Harry the mule, George, the hippo', -1, PREG_SPLIT_DELIM_CAPTURE));

array(9) {
  [0]=>
  string(15) "Henry the horse"
  [1]=>
  string(2) ", "
  [2]=>
  string(16) "Billy the donkey"
  [3]=>
  string(2) ", "
  [4]=>
  string(14) "Harry the mule"
  [5]=>
  string(2) ", "
  [6]=>
  string(6) "George"
  [7]=>
  string(2) ", "
  [8]=>
  string(9) "the hippo"
}
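With PREG_SPLIT_DELIM_CAPTURE the delimiters come back as separate array elements. If the comma should instead stay attached to the member before it, one possible variation is to split on the whitespace after a comma using a lookbehind, so the comma itself is never consumed. This is a sketch, not part of the answer above: it assumes, as discussed elsewhere in this thread, that every member starts with a capital letter, and $members is an illustrative name.

```php
<?php
$string = "Henry the horse, Billy the donkey, Harry the mule, George, the hippo";

// Split on whitespace that is preceded by a comma and followed by a capital.
// Lookbehind and lookahead match without consuming, so the comma stays put.
// Assumption: each member begins with a capital letter.
$members = preg_split('/(?<=,)\s+(?=[A-Z])/', $string);

print_r($members);
```

The comma after George is followed by a lowercase word, so no split happens there and "George, the hippo" stays in one piece.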
Crouching Kitten