
I have a string like

BK0001 My book (4th Edition) $49.95 (Clearance Price!)

I would like a way to split it into different parts like

[BK0001] 
[My Book (4th Edition)] 
[$49.95] 
[(Clearance Price!)]

I'm pretty new to regex and I'm using this to parse a line in a file. I managed to get the first part, BK0001, by using

$parts = preg_split('/\s+/', 'BK0001 My book (4th Edition) $49.95 (Clearance Price!)');

and then reading $parts[0], but I'm not sure how to split the string to get the other values.

Mohammad
answerSeeker
  • Have you used regex101 yet? It's a great resource both for learning regexes and for developing one for a particular need. – erik258 Nov 07 '18 at 20:42
  • Try spelling out the subpatterns. Say, `preg_match('~^(?<code>\S+)\s+(?<name>.*?)\s+(\$\d[\d.]*)\s*(?<details>.*)$~', $text, $matches)`, see [demo](https://regex101.com/r/EF0I6W/1). – Wiktor Stribiżew Nov 07 '18 at 20:43
  • @Dan Farrel I have, but I don't use PHP and regex often. I code mostly in Python and usually use string.split() for tasks such as these. This is one of those rare moments when I need regex, and investing time learning it fully isn't really a good option right now. – answerSeeker Nov 07 '18 at 20:48
  • @WiktorStribiżew works perfectly. Thanks – answerSeeker Nov 07 '18 at 20:48
  • `learning it fully isn't really a good option right now` It's always good to learn regex; most languages have some flavor of it, and it's incredibly powerful and useful. – ArtisticPhoenix Nov 07 '18 at 21:07

2 Answers


You may match the specific parts of the input string using a single pattern with capturing groups:

preg_match('~^(?<code>\S+)\s+(?<name>.*?)\s+(?<num>\$\d[\d.]*)\s*(?<details>.*)$~', $text, $matches)

See the regex demo. The final $ is not strictly required; it is there only to show that the whole string is matched.

Details

  • ^ - start of a string
  • (?<code>\S+) - Group "code": one or more non-whitespace chars
  • \s+ - 1+ whitespaces
  • (?<name>.*?) - Group "name": any 0+ chars other than line break chars, as few as possible
  • \s+ - 1+ whitespaces
  • (?<num>\$\d[\d.]*) - Group "num": a $, then 1 digit and then 0+ digits or .
  • \s* - 0+ whitespaces
  • (?<details>.*) - Group "details": any 0+ chars other than line break chars, as many as possible
  • $ - end of string.

PHP code:

$re = '~^(?<code>\S+)\s+(?<name>.*?)\s+(?<num>\$\d[\d.]*)\s*(?<details>.*)$~';
$str = 'BK0001 My book (4th Edition) $49.95 (Clearance Price!)';
if (preg_match($re, $str, $m)) {
    echo "Code: " . $m["code"] . "\nName: " . $m["name"] . "\nPrice: " .
         $m["num"] . "\nDetails: " . $m["details"]; 
}

Output:

Code: BK0001
Name: My book (4th Edition)
Price: $49.95
Details: (Clearance Price!)
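
Since the question mentions parsing lines from a file, here is a minimal sketch of how the same pattern might be applied line by line. The helper name parseBookLine, the 'price' key, and the second and third sample lines are assumptions made for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper (not from the answer above): parse one line into its
// named parts, or return null when the line does not match the pattern.
function parseBookLine(string $line): ?array
{
    $re = '~^(?<code>\S+)\s+(?<name>.*?)\s+(?<num>\$\d[\d.]*)\s*(?<details>.*)$~';
    if (!preg_match($re, trim($line), $m)) {
        return null;
    }
    // Keep only the named groups; preg_match also fills in numeric keys.
    return [
        'code'    => $m['code'],
        'name'    => $m['name'],
        'price'   => $m['num'],
        'details' => $m['details'],
    ];
}

// Sample lines standing in for the contents of a file (made-up data).
$lines = [
    'BK0001 My book (4th Edition) $49.95 (Clearance Price!)',
    'BK0002 Another title $10.00',
    'not a catalogue line',
];

foreach ($lines as $line) {
    $parts = parseBookLine($line);
    if ($parts === null) {
        echo "Skipped: $line\n";
        continue;
    }
    echo $parts['code'], ' | ', $parts['name'], ' | ', $parts['price'], ' | ', $parts['details'], "\n";
}
```

Returning null for a non-matching line lets the caller decide whether to skip it or report it, rather than silently producing empty fields.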
Wiktor Stribiżew

Try using preg_match

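$book_text = 'BK0001 My book (4th Edition) $49.95 (Clearance Price!)';
// Single-quoted pattern, so the backslashes reach the regex engine unchanged.
if (preg_match('/(\w+)\s+(.*?)\s+\((.*?)\)\s+(\$[\d.]+)\s+\((.*?)\)$/', $book_text, $matches)) {
    // $matches[1] to $matches[5] hold the captured parts
    print_r($matches);
}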

$matches[0] holds the full matched string. The captured parts start at $matches[1]:

Array
(
    [0] => BK0001 My book (4th Edition) $49.95 (Clearance Price!)
    [1] => BK0001
    [2] => My book
    [3] => 4th Edition
    [4] => $49.95
    [5] => Clearance Price!
)

$matches[1] is "book number"
$matches[2] is "book name"
$matches[3] is "edition"
$matches[4] is "price"
$matches[5] is "special text"
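
Note that the pattern above requires both parenthesised parts, so a line without an edition or a trailing note will not match. If some lines may lack them, one possible variation is to wrap each parenthesised part in an optional non-capturing group. The pattern and the second sample line below are assumptions for illustration, not part of the original answer:

```php
<?php
// Variant pattern (an assumption, not the answer's): the edition and the
// trailing note are each wrapped in an optional non-capturing group.
$re = '/^(\w+)\s+(.*?)(?:\s+\((.*?)\))?\s+(\$[\d.]+)(?:\s+\((.*?)\))?$/';

$samples = [
    'BK0001 My book (4th Edition) $49.95 (Clearance Price!)',
    'BK0002 Plain title $10.00', // made-up line without the optional parts
];

foreach ($samples as $text) {
    if (preg_match($re, $text, $m)) {
        // Optional groups that did not take part in the match may be
        // missing or empty, so default them to an empty string.
        $edition = $m[3] ?? '';
        $special = $m[5] ?? '';
        echo "$m[1] | $m[2] | $edition | $m[4] | $special\n";
    }
}
```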
Ravi Rajendra