I am working on an app that takes text as input and produces sequence diagrams as output, similar to Web Sequence Diagrams. For a basic input, such as Alice saying Hi to Bob, the input is:

Alice -> Bob: Hi

Users can vary the whitespace around the arrow and the colon however they like. The variations of the above line are:

Alice -> Bob : Hi
Alice -> Bob :Hi
Alice -> Bob: Hi
Alice -> Bob:Hi
Alice ->Bob : Hi
Alice ->Bob :Hi
Alice ->Bob: Hi
Alice ->Bob:Hi
Alice-> Bob : Hi
Alice-> Bob :Hi
Alice-> Bob: Hi
Alice-> Bob:Hi
Alice->Bob : Hi
Alice->Bob :Hi
Alice->Bob: Hi
Alice->Bob:Hi

The other variations of the messages include the following arrows:

  • -
  • --
  • ->
  • -->
  • ->>
  • -->>

Even if I split the input on -> and :, it is difficult, because lines with different arrows can come in any order. For example:

Alice --> Bob: Hello
Bob -> Alice: See you!

At first, I required users to put spaces around the arrows. Splitting on spaces then gives three items, and the third item is split again on the :. This is done with the code below:

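// Split the input into trimmed lines.
$userInput = array_map('trim', explode("\r\n", trim($input)));
foreach ($userInput as $line) {
    // Turn ":" into a space, then split on spaces into at most four parts.
    $line = array_filter(array_map('trim', explode(" ", str_replace(array(":", ": "), " ", $line), 4)));
}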

I replace the : with a space and split the string into at most four parts, using a space as the delimiter. Am I doing this right? It doesn't work when the spacing varies as shown above, or when the user uses the other arrow types. Please guide me.

Praveen Kumar Purushothaman

1 Answer

Try using regular expressions and preg_match (http://www.php.net/preg_match). It will make your life a lot easier.

Regular Expression Pattern:

/(\w+)\s*(\-+>{1,2})\s*(\w+)\s*:\s*(\w+)/i

Breakdown:

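(\w+)       <- Match and return 1 or more word characters (letters, digits, underscore)
\s*         <- Match 0 or more whitespace characters
(\-+>{1,2}) <- Match and return 1 or more "-" characters followed by 1 or 2 ">" characters
:           <- Match the literal ":" that separates the names from the message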

Source:

<?php
foreach ($userInput as $line) {
    $matches = array();
    // preg_match returns 1 on a match; skip lines that do not match.
    if (preg_match('/(\w+)\s*(\-+>{1,2})\s*(\w+)\s*:\s*(\w+)/i', $line, $matches)) {
        echo $matches[1] . "\n"; // Alice
        echo $matches[2] . "\n"; // --> or -> or --->
        echo $matches[3] . "\n"; // Bob
        echo $matches[4] . "\n"; // Hi
    }
}
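
As a quick sanity check, the pattern can be run against the sixteen spacing variations listed in the question. This is only a sketch: the loop that builds the variations and the `$variations` and `$results` names are illustrative, not part of the original code.

```php
<?php
// Pattern from the answer, with the arrow in its own capture group.
$pattern = '/(\w+)\s*(\-+>{1,2})\s*(\w+)\s*:\s*(\w+)/i';

// Build the 16 spacing variations from the question: an optional space
// on each side of the arrow and on each side of the colon.
$variations = array();
foreach (array('', ' ') as $a) {
    foreach (array('', ' ') as $b) {
        foreach (array('', ' ') as $c) {
            foreach (array('', ' ') as $d) {
                $variations[] = 'Alice' . $a . '->' . $b . 'Bob' . $c . ':' . $d . 'Hi';
            }
        }
    }
}

$results = array();
foreach ($variations as $line) {
    if (preg_match($pattern, $line, $m)) {
        // Drop $m[0] (the full match) and keep the four captured parts.
        $results[] = array_slice($m, 1);
    }
}
```

Each entry of `$results` holds the sender, the arrow, the receiver, and the first word of the message, in that order.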
Tom
  • *"@Tom I don't wanna use regex and complicate things. :) – Praveen Kumar"* - Says he doesn't want to use regex under comments. – Funk Forty Niner Jan 05 '15 at 21:20
  • @Fred-ii- his reasoning for not using regex is to prevent complicating his code. In this case I believe more complications will arise from text munging with `str_replace` and `explode`'s. This is a perfect use case for regex. He is welcome to not pick it as his answer if he wishes, but it's still a valid option. – Tom Jan 05 '15 at 21:24
  • @Tom Lemme check this. If it works, fine. There's no other way to do this, so I will use this!!! `:D` – Praveen Kumar Purushothaman Jan 05 '15 at 21:38
  • @Tom, I also need the type of the arrow in this case. How do I get that! Oops... `:(` – Praveen Kumar Purushothaman Jan 05 '15 at 21:39
  • @Fred-ii- No worries. Anything, that works is fine. `:D` – Praveen Kumar Purushothaman Jan 05 '15 at 21:41
  • @Tom There could be `-->>` kind of arrows with two `>`s too. How to find them? – Praveen Kumar Purushothaman Jan 08 '15 at 11:48
  • Updated. If you need to have more than 1 or 2 items, replace `{1,2}` (which means between 1 and 2) with `+` (which means 1 or more in regex) – Tom Jan 08 '15 at 13:12
  • @Tom Sorry to bother you again. But I am getting only the first word of the message. Can you help me to get the rest of the sentence? Cheers. – Praveen Kumar Purushothaman Jan 12 '15 at 20:30
  • Try this: `/([^\-]+)\-+>{1,2}([^:]+):(.*)/i`. It's simpler, but you might need to trim the outputs to remove whitespace. `([^\-]+)` means return 1 or more characters until you hit a `-` character, and `([^:]+)` does the same for the `:` character. – Tom Jan 13 '15 at 12:32
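
Putting the last suggestion together with the earlier request for the arrow type, a minimal sketch might look like the following. The helper name `parseMessageLine` is hypothetical, the capture group around the arrow and the `^`/`$` anchors are additions to the pattern from the comment, and the code assumes participant names contain no `-` or `:` characters.

```php
<?php
// Hypothetical helper (not from the thread): parse one diagram line into
// sender, arrow, receiver and the full message text.
function parseMessageLine($line)
{
    // Same idea as the pattern in the last comment, plus a capture group
    // around the arrow. Assumes names contain no "-" or ":" characters.
    $pattern = '/^([^\-]+)(\-+>{1,2})([^:]+):(.*)$/';
    if (!preg_match($pattern, $line, $m)) {
        return null; // line is not in "A -> B: message" form
    }
    // The loose character classes also capture surrounding spaces, so trim.
    return array_map('trim', array_slice($m, 1));
}

$parsed = parseMessageLine('Bob -->> Alice : See you later!');
// $parsed is array('Bob', '-->>', 'Alice', 'See you later!')
```

Because `>{1,2}` requires at least one `>`, this sketch rejects the plain `-` and `--` arrows from the question; relaxing it to `>{0,2}` is one way to accept them as well.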