I have a string that looks like this:

aaaaa: lorem ipsum bb: dolor sit amet ccc: no pro movet

What would be the best way to split the string into an array and get the following result in PHP?

array[0]='aaaaa: lorem ipsum';
array[1]='bb: dolor sit amet';
array[2]='ccc: no pro movet';

I can write a function that finds the position of each ":", finds the length of the word before it, and splits the string. But I guess there is an easier way using regular expressions?

Toto
johnohod
  • While a regexp will help you here, you should ask yourself if you shouldn't fix the design problem that led you there – Burki Jun 23 '17 at 08:00
  • Which regex have you tried so far? – Julien Lachal Jun 23 '17 at 08:00
  • @Burki, true, but I get the string from an external system so I have to handle it somehow. I haven't actually tried any regex yet, I'm not so experienced with them. – johnohod Jun 23 '17 at 08:05
  • @johnohod Saying that you are not experienced doesn't mean that you cannot at least try. Read a bit. That would help you learn. Use `preg_match_all` and the following pattern: `([a-z]+:)`. That should be a good start, imo. – Ivanka Todorova Jun 23 '17 at 08:07

2 Answers

For this kind of job, I'd use preg_match_all():

$str = 'aaaaa: lorem ipsum bb: dolor sit amet ccc: no pro movet';
preg_match_all('/\S+:.+?(?=\S+:|$)/', $str, $m);
print_r($m);

Output:

Array
(
    [0] => Array
        (
            [0] => aaaaa: lorem ipsum 
            [1] => bb: dolor sit amet 
            [2] => ccc: no pro movet
        )

)

Explanation:

\S+:        : 1 or more non-space characters followed by a colon
.+?         : 1 or more of any character, non-greedy
(?=\S+:|$)  : lookahead, asserts that what follows is 1 or more non-space characters and a colon, or the end of the string
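
Note that the lazy `.+?` stops right before the next key, so every match except the last keeps the separating space (visible in the output above). If that matters, a minimal sketch of one way to clean it up, assuming surrounding whitespace is never significant in your data, is to take the full matches from `$m[0]` and trim them (`$parts` is just an illustrative name):

```php
<?php
$str = 'aaaaa: lorem ipsum bb: dolor sit amet ccc: no pro movet';

// Same pattern as above; the full matches are collected in $m[0].
preg_match_all('/\S+:.+?(?=\S+:|$)/', $str, $m);

// Strip the trailing space left on all but the last match.
// Assumption: leading/trailing whitespace carries no meaning in the values.
$parts = array_map('trim', $m[0]);

var_export($parts);
```

This should leave a flat array of the three strings asked for in the question, without the trailing spaces.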
Toto

Your desired one-dimensional array can be produced directly with preg_split(). preg_split() is a better fit for this task than preg_match_all() because the only unwanted characters are the delimiting spaces. preg_match_all() creates a more complex array structure than you need, so there is the extra step of accessing the first subarray.

My pattern will split the string on every space that is followed by one or more lowercase letters, then a colon.

Code:

$string = 'aaaaa: lorem ipsum bb: dolor sit amet ccc: no pro movet';
var_export(preg_split('/ (?=[a-z]+:)/', $string));

Output:

array (
  0 => 'aaaaa: lorem ipsum',
  1 => 'bb: dolor sit amet',
  2 => 'ccc: no pro movet',
)
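
If the external system is less tidy than the sample, the same lookahead idea can be loosened a little. The following is only a sketch under assumptions that the question does not confirm: that keys may contain uppercase letters, digits, or underscores (hence `\w+`), and that the separator may be any run of whitespace (hence `\s+`). The input string here is made up for illustration.

```php
<?php
// Hypothetical input: a double space, a tab, and a key with an uppercase letter and a digit.
$string = "aaaaa: lorem ipsum  bb: dolor sit amet\tCcc2: no pro movet";

// \s+ consumes the whole run of whitespace before a key;
// the lookahead (?=\w+:) only checks for the key and leaves it in the next piece.
$parts = preg_split('/\s+(?=\w+:)/', $string);

var_export($parts);
```

Like the original pattern, this will also split inside a value if the value itself contains a word immediately followed by a colon, so it is only safe when values never look like keys.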
mickmackusa