2
$str = "[10:42-23:10]part1[11:30-13:20]part2";

I wish to split it into something like:

[1] 10:42-23:10
[2] part1
[3] 11:30-13:20
[4] part2

The best I managed to come up with is:

$parts = preg_split("/(\\[*\\])\w+/", $str );

But this returns

[0] => [10:42-23:10
[1] => [11:30-13:20
[2] =>
Mohammad
gilad s

4 Answers

3

Split on an alternation of [ and ], and pass the PREG_SPLIT_NO_EMPTY flag so that empty parts are discarded.

$str = "[10:42-23:10]part1[11:30-13:20]part2";
$parts = preg_split("/\[|\]/", $str, -1, PREG_SPLIT_NO_EMPTY );
print_r($parts);

Output:

Array
(
    [0] => 10:42-23:10
    [1] => part1
    [2] => 11:30-13:20
    [3] => part2
)
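
To see why the flag matters, here is a small sketch comparing the same split with and without PREG_SPLIT_NO_EMPTY. The variable names `$withEmpty` and `$withoutEmpty` are mine, not from the answer. Because the string starts with a delimiter, the call without the flag should return an empty first element.

```php
<?php
$str = "[10:42-23:10]part1[11:30-13:20]part2";

// Without the flag: the leading "[" produces an empty first element.
$withEmpty = preg_split('/\[|\]/', $str);

// With the flag: empty pieces are dropped and the array is indexed from 0.
$withoutEmpty = preg_split('/\[|\]/', $str, -1, PREG_SPLIT_NO_EMPTY);

var_dump(count($withEmpty));    // int(5)
var_dump(count($withoutEmpty)); // int(4)
```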

NB.

Thanks to @WiktorStribiżew: his regex /[][]/ is more efficient. I ran a benchmark, and the alternation is about 40% slower than the character class (put the other way, the character class is about 66% faster).

$str = "[10:42-23:10]part1[11:30-13:20]part2";
$parts = preg_split("/[][]/", $str, -1, PREG_SPLIT_NO_EMPTY );
print_r($parts);
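
A note on why `[][]` is a valid pattern: in PCRE, a `]` placed immediately after the opening `[` of a character class is taken literally, so the class contains `]` and `[`. The sketch below is my own check, not part of the benchmark; it compares the compact form with the fully escaped spelling `[\[\]]`, which should behave identically.

```php
<?php
$str = "[10:42-23:10]part1[11:30-13:20]part2";

// "]" right after the opening "[" is literal, so this class is { "]", "[" }.
$compact = preg_split('/[][]/', $str, -1, PREG_SPLIT_NO_EMPTY);

// The same class with both brackets escaped, for readers who prefer explicitness.
$escaped = preg_split('/[\[\]]/', $str, -1, PREG_SPLIT_NO_EMPTY);

var_dump($compact === $escaped); // bool(true)
```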

Here is the Perl script I used for the benchmark:

#!/usr/bin/perl
use Benchmark qw(:all);

my $str = "[10:42-23:10]part1[11:30-13:20]part2";

my $count = -5;
cmpthese($count, {
    '[][]' => sub {
        my @parts = split(/[][]/, $str);
    },
    '\[|\]' => sub {
        my @parts = split(/\[|\]/, $str);
    },
});

Result: (2 runs)

>perl -w benchmark.pl
          Rate \[|\]  [][]
\[|\] 536640/s    --  -40%
[][]  891396/s   66%    --
>Exit code: 0

>perl -w benchmark.pl
          Rate \[|\]  [][]
\[|\] 530867/s    --  -40%
[][]  885242/s   67%    --
>Exit code: 0
Toto
  • While the pattern is simple, you may write it even better, as `"/[][]/"` – Wiktor Stribiżew Feb 05 '17 at 14:22
  • @WiktorStribiżew: True, but my regex seems more readable to me. It's only a point of view – Toto Feb 05 '17 at 14:29
  • Sure, it is always a matter of taste/style. I prefer precision and performance to readability. – Wiktor Stribiżew Feb 05 '17 at 14:31
  • @WiktorStribiżew: Mea culpa, I've just run a benchmark comparing the two regexes, and yours is much more efficient, by something like 40%! I'll edit my answer. – Toto Feb 05 '17 at 14:35
  • @Toto: I'm curious to see how you benchmarked that! – Casimir et Hippolyte Feb 05 '17 at 16:26
  • @CasimiretHippolyte: I did it in Perl with the `Benchmark` package. – Toto Feb 05 '17 at 16:51
  • @Toto: with PHP the results are similar: the alternation is sometimes a little faster with the example string (a string where the delimiters are relatively frequent), while with a string that has long parts without delimiters the character class is a little faster. Either way the differences are negligible, so in practice... – Casimir et Hippolyte Feb 05 '17 at 17:01
  • @CasimiretHippolyte: Sure, in practice it makes no difference, it's only 1 or 2 µs per match. – Toto Feb 05 '17 at 17:52
  • Note that it's also possible with PCRE to use the STUDY modifier `~]|\[~S`, which may improve this kind of pattern (the effect of this modifier is somewhat unpredictable, depending on whether information to improve the search is found, but alternations starting with literal characters generally give good results). – Casimir et Hippolyte Feb 05 '17 at 19:57
3

You can also match the parts with preg_match_all() instead of splitting with preg_split():

$str = "[10:42-23:10]part1[11:30-13:20]part2";
preg_match_all("/[^\[\]]+/", $str, $parts);
print_r($parts[0]);
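
As a rough illustration of what preg_match_all() hands back here: `$parts` is an array of arrays, and index 0 holds the full-pattern matches, which is why the snippet prints `$parts[0]`. The `$count` variable is my addition for this sketch.

```php
<?php
$str = "[10:42-23:10]part1[11:30-13:20]part2";

// Match every run of characters that are neither "[" nor "]".
$count = preg_match_all('/[^\[\]]+/', $str, $parts);

// $parts[0] lists the full-pattern matches in order of appearance.
print_r($parts[0]);

// The return value is the number of full matches found.
var_dump($count); // int(4)
```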


Mohammad
2

Use a simple regex that matches any [...] substring (\[[^][]*]) and wrap the whole pattern in a capturing group. You can then pass it to preg_split with the PREG_SPLIT_DELIM_CAPTURE flag to get both the captured delimiters and the substrings between the matches:

$re = '/(\[[^][]*])/';
$str = '[10:42-23:10]part1[11:30-13:20]part2';
$matches = preg_split($re, $str, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
print_r($matches);
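
For reference, this split keeps the square brackets around the captured pieces. If the brackets are not wanted (as in the question's desired output), one option is to trim them afterwards. This step is my addition rather than part of the approach above, and `$clean` is a name chosen for the sketch.

```php
<?php
$re = '/(\[[^][]*])/';
$str = '[10:42-23:10]part1[11:30-13:20]part2';
$matches = preg_split($re, $str, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
// $matches: "[10:42-23:10]", "part1", "[11:30-13:20]", "part2"

// Strip any "[" or "]" characters from both ends of each piece.
$clean = array_map(function ($piece) {
    return trim($piece, '[]');
}, $matches);

print_r($clean);
```

Note that trim() would also strip brackets from the ends of the in-between pieces, so this only suits input where those pieces do not start or end with a bracket.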


With this approach you have better control over what is matched inside the square brackets, since you can adjust the pattern to match only time ranges, e.g.

(\[\d{2}:\d{2}-\d{2}:\d{2}])

With that pattern, the string [10:42-23:10]part1[11:30-13:20]part2[4][5] is split into [10:42-23:10], part1, [11:30-13:20] and part2[4][5] (note that [4][5] are not split out).
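
A minimal sketch of that stricter pattern in use, assuming the same flags as in the earlier snippet:

```php
<?php
// Only bracketed time ranges of the form [hh:mm-hh:mm] act as delimiters.
$re = '/(\[\d{2}:\d{2}-\d{2}:\d{2}])/';
$str = '[10:42-23:10]part1[11:30-13:20]part2[4][5]';

$matches = preg_split($re, $str, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);

// "[4]" and "[5]" do not look like time ranges, so they stay attached to "part2".
print_r($matches);
```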


Wiktor Stribiżew
1

Without regex, you can use strtok:

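$str = '[10:42-23:10]part1[11:30-13:20]part2';
$result = [];
// Each character of '[]' is a delimiter; strtok() skips empty tokens itself.
// Comparing against false (rather than using empty()) keeps a token such as "0".
for ($tok = strtok($str, '[]'); $tok !== false; $tok = strtok('[]')) {
    $result[] = $tok;
}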
Casimir et Hippolyte