-1

I am trying to split a text like the one below and assign its values to an array:

Title: Wonderful World

----

Text: 
Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
Sed facilisis nulla dui, etiaculis enim porta aliquet. 
Etiam ante mauris, luctus non ultricies ut, pellentesque non eros. 

<b>Pellentesque</b> sit amet eros in quam pharetra fermentum quis ac lacus.
 Maecenas turpis purus, molestie eu quam non, adipiscing hendrerit nibh.

Go to <a href="/">Main Site</a>

----

Image: mysite.com/images/logo.png

After splitting and parsing, the result should be equivalent to a PHP array like:

array (
    'Title' => "Wonderful World",
    'Text' => "Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
              Sed facilisis nulla dui, etiaculis enim porta aliquet. 
              Etiam ante mauris, luctus non ultricies ut, pellentesque non eros. 

              <b>Pellentesque</b> sit amet eros in quam pharetra fermentum quis ac lacus.
               Maecenas turpis purus, molestie eu quam non, adipiscing hendrerit nibh.

              Go to <a href="/">Main Site</a>",
    'Image' => "mysite.com/images/logo.png"
);

So basically what it will do is:

  1. Split the text on exactly 4 dashes (----), ignoring runs of fewer or more than 4 dashes. If possible, also ignore 4 dashes followed directly by other characters, like

    ----xxxx
    

    while

    ---- xxxx
    

    should work (followed by a space or line break).

  2. Create an array key from the first word that is followed by the first colon

  3. Create the array value from whatever comes after that word and colon, up to the end of the file or the next ----

  4. It should preserve the HTML tags and lines

  5. If there is only one keyword with a colon, even without a 4-dash separator, it should still be assigned to an array with a single element, so if the text contains:

    Title: Wonderful World

    will still create

    array (
        'Title' => "Wonderful World"
    );
    
  6. It should be intelligent enough to ignore the spaces around the colon, so the following 3 examples are treated the same way:

    Title: Wonderful World

    Title :Wonderful World

    Title : Wonderful World

    and still create an array like

    array (
        'Title' => "Wonderful World"
    );
    

I have looked into YAML, but it's not ideal for standard text input. Do you know of any PHP library for this, or how I can pull this off? Thank you.

user702300

2 Answers

1

Try this:

<?php
$src = <<<END_SRC
Title: Wonderful World

----

Text: 
Lorem ipsum dolor sit amet, consectetur adipiscing elit. 
Sed facilisis nulla dui, etiaculis enim porta aliquet. 
Etiam ante mauris, luctus non ultricies ut, pellentesque non eros. 

<b>Pellentesque</b> sit amet eros in quam pharetra fermentum quis ac lacus.
 Maecenas turpis purus, molestie eu quam non, adipiscing hendrerit nibh.

Go to <a href="/">Main Site</a>

----

Image: mysite.com/images/logo.png
END_SRC;

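// Split on four dashes followed by whitespace (a space or a line break).
$a = preg_split('/----\s/', $src);

$data = array();
foreach ($a as $part) {
    // Skip any block that has no colon, and therefore no key.
    if (strpos($part, ':') === false) continue;
    // Split at the first colon only, so colons inside the value are kept.
    list($key, $value) = explode(':', $part, 2);
    $key = trim($key);
    $value = trim($value);
    // A repeated key is appended to the earlier value rather than overwriting it.
    if (isset($data[$key])) $data[$key] .= "\n\n$value";
    else $data[$key] = $value;
}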

print_r($data);

?>
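
Note that `/----\s/` also matches inside a longer run such as `-----`, and four dashes in the middle of a line. If that matters, here is a minimal sketch of a stricter split, assuming the separator always starts its own line. The function name `parse_blocks` and the sample string are made up for illustration. Anything that follows `---- ` on the same line is simply left in the next block, since the question doesn't say what should happen to it.

```php
<?php
// Hypothetical helper, not part of the answer above.
function parse_blocks($src) {
    // ^ with the m flag anchors at each line start, and the lookahead requires
    // whitespace or the end of input right after exactly four dashes,
    // so "-----" and "----xxxx" are not treated as separators.
    $parts = preg_split('/^-{4}(?=\s|\z)/m', $src);

    $data = array();
    foreach ($parts as $part) {
        if (strpos($part, ':') === false) continue; // no key in this block
        list($key, $value) = explode(':', $part, 2);
        $data[trim($key)] = trim($value);
    }
    return $data;
}

$sample = "Title : Wonderful World\n\n----\n\nText: line one\n-----\nline two\n----xxxx\n\n---- \nImage: mysite.com/images/logo.png";
$parsed = parse_blocks($sample);
print_r($parsed);
```

In this sample the `-----` and `----xxxx` lines stay inside the `Text` value, while `----` followed by a line break or a space starts a new block.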
r3mainer
0

Here you go:

$text = 'Title: Wonderful World

----

Text: 
......';

$result = array();

$block_sep = PHP_EOL.'----'.PHP_EOL;

foreach(explode($block_sep, $text) as $block){
    $block = explode(':', $block, 2);
    $result[trim($block[0])] = trim($block[1]);
}

print_r($result);

I'm making some assumptions, though:

  • The four-dashes marker must not end or start with whitespace (or any other character really).
  • Every "block" will contain a key (the text before the colon) and a value (the text after).
  • Duplicate keys in the text will overwrite existing keys in $result.
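
The first two assumptions can be relaxed a little. Below is a minimal sketch, assuming the input might use `\r\n` line endings (text submitted from a form, for instance) while `PHP_EOL` is `\n` on the server. It swaps the `PHP_EOL`-based separator for PCRE's `\R` newline escape and skips blocks that have no colon. The sample string is made up for illustration.

```php
<?php
// Sample input with Windows-style "\r\n" line endings (an assumption for this sketch).
$text = "Title: Wonderful World\r\n\r\n----\r\n\r\nText: Lorem ipsum";

$result = array();

// \R matches "\n", "\r\n" or "\r", so the split does not depend on PHP_EOL.
foreach (preg_split('/\R----\R/', $text) as $block) {
    $block = explode(':', $block, 2);
    // Skip blocks without a colon instead of reading a missing index.
    if (count($block) < 2) continue;
    $result[trim($block[0])] = trim($block[1]);
}

print_r($result);
```

The marker still has to sit alone on its line here, with no surrounding whitespace.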

You can see it working here: http://3v4l.org/T84dd

PS: In theory you can get this working with just one regular expression (and preg_match_all()).
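
A rough sketch of what that single-expression approach could look like, assuming keys are a single word at the start of a line and the marker is four dashes at the start of a line followed by whitespace. The pattern and the shortened sample text are my own illustration, so treat them as a starting point rather than a tested solution.

```php
<?php
$text = "Title : Wonderful World\n\n----\n\nText:\nLorem ipsum\n<b>dolor</b> sit amet\n\n----\n\nImage: mysite.com/images/logo.png";

// m: ^ matches at every line start; s: . also matches newlines.
// Group 1 is the key (one word before the colon). Group 2 is the value, which runs
// lazily up to the next line starting with "----" plus whitespace, or the end of input.
$pattern = '/^[ \t]*(\w+)[ \t]*:(.*?)(?=^----\s|\z)/ms';

$result = array();
if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
    foreach ($matches as $m) {
        $result[$m[1]] = trim($m[2]);
    }
}

print_r($result);
```

Because the value is consumed up to the next marker, a later line inside it that happens to look like `word:` does not start a new key.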

Christian
  • It didn't work; I think it was due to the PHP_EOL at the beginning. The solution given by squeamish worked. Really appreciate it! – user702300 Dec 02 '13 at 01:19