
I have a file formatted as...

file.txt

[sectionone]
...
...
[sectiontwo]
...
...
[sectionthree]
...
...

The format is very similar to smb.conf (for those familiar with it), and I was hoping to end up with an array of "section" strings. I'm looking to use preg_split to take each section of text and put it into an array like so...

Array
(
    [0] => [sectionone]
           ...
           ...
    [1] => [sectiontwo]
           ...
           ...
    [2] => [sectionthree]
           ...
           ...
)

I know I could read the file line by line and create a solution that way but I'm stubborn as hell and trying to figure this out as it suits my needs. The split must occur when a '[' (bracket) is at the beginning of any line and anything leading up to the next bracket (newlines, tabs, any characters, etc) is fair game. Most of my attempts have either resulted in nothing or an array count of 1 with EVERYTHING.

 $fileString = file_get_contents( '/tmp/file.txt' );
 print_r( preg_split( "/^\[.*\]\n$/", $fileString ) );

...results in the undesired...

Array
(
    [0] => [sectionone]
           ...
           ...
           [sectiontwo]
           ...
           ...
           [sectionthree]
           ...
           ...
)

Any help would be greatly appreciated as my regex skills are beginner at best. Thanks in advance.

Evan

3 Answers


Please consider using parse_ini_file() or parse_ini_string(), which already parse a file in the same format as smb.conf into an array of configuration items.

For example, given the following config sample.ini (example from parse_ini_file() docs):

[first_section]
one = 1
five = 5
animal = BIRD

[second_section]
path = "/usr/local/bin"
URL = "http://www.example.com/~username"

The following code:

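// BIRD must be defined for the constant in "animal = BIRD" to expand.
define('BIRD', 'Dodo bird');
$ini_array = parse_ini_file("sample.ini", true);
print_r($ini_array);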

will produce:

Array
(
    [first_section] => Array
        (
            [one] => 1
            [five] => 5
            [animal] => Dodo bird
        )

    [second_section] => Array
        (
            [path] => /usr/local/bin
            [URL] => http://www.example.com/~username
        )
)
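
Since parse_ini_string() is mentioned above but not shown, here is a minimal sketch of the same idea on an in-memory string. The sample text and the variable names ($ini, $config) are made up for illustration, and it assumes the content really is key = value pairs under section headers.

```php
<?php
// Hypothetical sample data; BIRD is defined so the constant in the value expands.
define('BIRD', 'Dodo bird');

$ini = <<<'INI'
[first_section]
one = 1
animal = BIRD

[second_section]
path = "/usr/local/bin"
INI;

// Second argument true: group the keys under their section names.
$config = parse_ini_string($ini, true);

if ($config === false) {
    // parse_ini_string() returns false when the text is not valid INI syntax.
    echo "Could not parse the input as INI\n";
} else {
    print_r(array_keys($config));                   // the section names
    echo $config['first_section']['animal'], "\n";  // the expanded constant
}
```

If the text between the headers is not strictly key = value, this approach is a poor fit and a regex or hand-written parser is the safer route.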
Elias Dorneles
  • First off I appreciate the response. Already went down that road and it requires a very strict format. In my case, unfortunately, between the section headers can be any number of horrid text, special characters, etc... not just x equals y. All I know for sure is the sections will start with the bracket. – Evan Dec 16 '13 at 17:26
  • Good suggestion but he said it is *similar* so if it's not the same then he would have to change his format so that it is the *exact same*. – bluegman991 Dec 16 '13 at 17:26
  • @Evan Ouch! Yeah, in that case you're better with a custom parser. – Elias Dorneles Dec 16 '13 at 17:29
  • @bluegman991 Yes, I did notice that, that's why I said to consider it -- the question wasn't clear if it had been tried. =) – Elias Dorneles Dec 16 '13 at 17:31
  • @Evan Try using: `preg_split("/^\[[^[]+\]\n$/", $fileString )` -- I think the `.*` is matching the last `]` greedily. – Elias Dorneles Dec 16 '13 at 17:35

Remove the ^ and the $ from your regex.

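Without the m (multiline) modifier, ^ matches only at the very start of the string and $ only at its end (or just before a final newline), so PHP can only match an opening bracket at the beginning of the string and a closing bracket at the end of it.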

$fileString = file_get_contents( '/tmp/file.txt' );
print_r( preg_split( "/\[.*\]\r?\n/", $fileString ) );

Something like this should work better for you.
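
To see the anchor behaviour in isolation, here is a small sketch. The two-section input is hypothetical, and the patterns are simplified versions of the ones above.

```php
<?php
// Hypothetical two-section input, used only for illustration.
$fileString = "[sectionone]\n...\n[sectiontwo]\n...\n";

// No m modifier: ^ and $ refer to the whole subject, so no header line matches.
$withoutM = preg_match_all('/^\[.*\]$/', $fileString);

// With m: ^ and $ match at every line boundary, so both headers are found.
$withM = preg_match_all('/^\[.*\]$/m', $fileString);

// Splitting on the header itself consumes it, so only the bodies remain
// (plus an empty first element for the text before the first header).
$pieces = preg_split('/\[.*\]\r?\n/', $fileString);

var_dump($withoutM, $withM);
print_r($pieces);
```

Note that preg_split() drops whatever the pattern matches, so splitting on the header text removes the headers from the result.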

bluegman991
  • Totally works, however, the headers themselves are missing. I do appreciate the solution though as I may use it elsewhere. – Evan Dec 16 '13 at 18:12

You could perhaps use preg_match_all instead?

$fileString = '[sectionone]
...
...
[sectiontwo]
...
...
[sectionthree]
...
...';
preg_match_all("/^\[.*?(?=\n\[|\z)/ms", $fileString, $matches);
print_r($matches);

This will match from a [ at the start of a line up to a \n that is followed by a [, or up to the end of the string. Both flags are important here: m makes ^ match at the beginning of every line, and s lets . match newlines.

Or with splitting...

print_r(preg_split("/\n(?=\[)/", $fileString));

This will match a \n only if it is followed by a [. The lookahead consumes nothing, so the [ stays at the start of the next section.
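
If the section name is needed separately from its body, the split result can be post-processed. The following is only a sketch: the sample input and the names $chunks and $sections are hypothetical, it assumes the file starts with a header, and it uses \R instead of \n so that Windows line endings also work.

```php
<?php
// Hypothetical input with Windows line endings and free-form section bodies.
$fileString = "[sectionone]\r\nfoo = bar\r\nany text!\r\n"
            . "[sectiontwo]\r\n\r\n"
            . "[sectionthree]\r\nlast line";

// Split at every line break (\R matches \n, \r\n or \r) that is followed by "[".
$chunks = preg_split('/\R(?=\[)/', $fileString);

$sections = [];
foreach ($chunks as $chunk) {
    // The first line is the header; everything after it is the body (may be empty).
    $parts = preg_split('/\R/', $chunk, 2);
    $name = trim($parts[0], '[]');
    // A repeated section name would overwrite the earlier entry.
    $sections[$name] = $parts[1] ?? '';
}

print_r($sections);
```

Any body line that happens to begin with [ starts a new section here, which follows the rule stated in the question.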

Jerry