In short, I'm trying to match the longest item furthest right in a string that fits this pattern:

[0-9][0-9\s]*(\.|,)\s*[0-9]\s*[0-9]

Consider, for example, the string "abc 1.5 28.00". I want to match "5 28.00".

Using the pattern "as-is", like so

preg_match_all('/[0-9][0-9\s]*(\.|,)\s*[0-9]\s*[0-9]/', 'abc 1.5 28.00', $result);

we instead get the following matches:

[0] => 1.5 2
[1] => 8.00

No "5 28.00" or "28.00" for that matter, for obvious reasons.

I did some research and people suggested using positive lookahead for problems like this. So I tried the following

preg_match_all('/(?=([0-9][0-9\s]*(\.|,)\s*[0-9]\s*[0-9]))/', 'abc 1.5 28.00', $result);

giving us these matches:

[0] => 1.5 2
[1] => 5 28.00
[2] => 28.00
[3] => 8.00

Now, "5 28.00" is in there which is good, but it can't be reliably identified as the correct match (e.g. you can't just traverse from the end looking for the longest match, because there could be a longer match that appeared earlier in the string). Ideally, I'd want those sub-matches at the end (indexes 2 and 3) to not be there so we can just grab the last index.
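To make the overlap concrete, here is a rough sketch that lists each lookahead candidate together with its start and end byte offset. The helper name `candidates` is made up for illustration; the flags `PREG_SET_ORDER` and `PREG_OFFSET_CAPTURE` are standard `preg_match_all` options.

```php
<?php
// Hypothetical helper: list every lookahead candidate with its byte offsets.
function candidates(string $subject): array
{
    // Same lookahead pattern as above; group 1 holds the candidate text.
    $pattern = '/(?=([0-9][0-9\s]*(\.|,)\s*[0-9]\s*[0-9]))/';
    preg_match_all($pattern, $subject, $sets, PREG_SET_ORDER | PREG_OFFSET_CAPTURE);

    $out = [];
    foreach ($sets as $set) {
        // With PREG_OFFSET_CAPTURE each group is a [text, offset] pair.
        [$text, $start] = $set[1];
        $out[] = ['text' => $text, 'start' => $start, 'end' => $start + strlen($text)];
    }
    return $out;
}

foreach (candidates('abc 1.5 28.00') as $c) {
    printf("%-8s start=%d end=%d\n", $c['text'], $c['start'], $c['end']);
}
```

For this input, "5 28.00", "28.00" and "8.00" all end at offset 13, while "1.5 2" ends at offset 9. The unwanted sub-matches are the ones that share an end offset with a longer candidate.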

Does anyone have ideas for how to accomplish exactly what I need in the simplest/best way possible? Let me know if I need to clarify anything as I know this stuff can get confusing, and many thanks in advance.

Edit: some additional input/match examples:

"abc 1.5 28.00999" => "5 28.00" (i.e. can't match end of string, $)

"abc 500000.05.00" => "5.00"

G.S.

2 Answers

Your problem is easily fixed by ensuring you match on the end of the input string by adding a dollar sign:

preg_match_all('/[0-9][0-9\s]*(\.|,)\s*[0-9]\s*[0-9]$/', 
               'abc 1.5 28.00', $result);

Returns:

array (size=2)
  0 => 
    array (size=1)
      0 => string '5 28.00' (length=7)
  1 => 
    array (size=1)
      0 => string '.' (length=1)

Now I'm not entirely sure why you wrapped the dot in parentheses, but this output is correct for your question as far as I can see, and implements the "farthest to the right" requirement.
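If the captured separator is not needed, a character class `[.,]` should do the same job without producing the extra group. The following is only a sketch: the helper name `match_at_end` is hypothetical, and it assumes the wanted number really does sit at the very end of the input.

```php
<?php
// Same pattern as above, but with [.,] instead of the capturing (\.|,).
$pattern = '/[0-9][0-9\s]*[.,]\s*[0-9]\s*[0-9]$/';

// Hypothetical helper: return the whole match, or null when nothing matches.
function match_at_end(string $pattern, string $subject): ?string
{
    return preg_match($pattern, $subject, $m) === 1 ? $m[0] : null;
}

var_dump(match_at_end($pattern, 'abc 1.5 28.00'));    // '5 28.00'
var_dump(match_at_end($pattern, 'abc 1.5 28.00999')); // NULL: trailing digits defeat the $ anchor
```

The second call shows the limit of this approach: as soon as anything follows the two decimal digits, the `$` anchor prevents a match.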

Niels Keurentjes
  • See my latest edit. Unfortunately we can't rely on the desired match being end of string – G.S. Apr 12 '13 at 16:21
  • Unfortunately your edit only makes the criteria less clear. For "abc 1.5 28.00999" => "5 28.00" I have no clue how it should decide on that result. You really need to clarify the required pattern. – Niels Keurentjes Apr 12 '13 at 16:24
  • What is unclear? On the most basic level, what I want is the match that fits the pattern at the very top of my original post and occurs rightmost in the search string. The complicating caveat is that I want the longest such string, so in the example I gave, whereas [3] => 8.00 is the rightmost match, [1] => 5 28.00 is the longest and rightmost – G.S. Apr 12 '13 at 16:27
  • The problem is that the example matches far more strings than you can account for. I could keep churning out regexps that actually match a new example, but you'd probably be able to find another example that wouldn't work. Try to describe the pattern instead of giving a limited set of examples. – Niels Keurentjes Apr 12 '13 at 16:31

The nearest match I can get for you is the following:

((?:\d\s*)+[.,](?:\s*\d){2})(?:(?![.,](?:\s*\d){2}).)*$

It produces the following output (look at index 1, the first capture group, in each case):

'abc 1.5 28.00999' => array (
  0 => '5 28.00999',
  1 => '5 28.00',
)
'abc 500000.05.00' => array (
  0 => '05.00',
  1 => '05.00',
)
'abc 111.5 8.0c 6' => array (
  0 => '111.5 8.0c 6',
  1 => '111.5 8',
)
'abc 500000.05.0a0' => array (
  0 => '500000.05.0a0',
  1 => '500000.05',
)
'abc 1.5 28.00999 6  0 0.6 6' => array (
  0 => '00999 6  0 0.6 6',
  1 => '00999 6  0 0.6 6',
)
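
For convenience, the pattern can be wrapped in a small function that returns only capture group 1. This is a sketch rather than a tested library routine: the names `rightmost_number` and `strip_spaces` are made up here, and removing the whitespace afterwards is an assumption based on the "111.58" form mentioned in the comments.

```php
<?php
// Hypothetical wrapper: return capture group 1 of the pattern above, or null.
function rightmost_number(string $subject): ?string
{
    // Group 1 is the number; the tail allows any characters up to the end
    // of the string as long as no further "[.,] + two digits" starts there.
    $pattern = '/((?:\d\s*)+[.,](?:\s*\d){2})(?:(?![.,](?:\s*\d){2}).)*$/';
    return preg_match($pattern, $subject, $m) === 1 ? $m[1] : null;
}

// Assumption: the caller wants the digits without embedded whitespace.
function strip_spaces(string $number): string
{
    return preg_replace('/\s+/', '', $number);
}

foreach (['abc 1.5 28.00999', 'abc 500000.05.00', 'abc 111.5 8.0c 6'] as $input) {
    $found = rightmost_number($input);
    echo $input, ' => ', $found === null ? 'no match' : strip_spaces($found), "\n";
}
```

Note that `.` in the tail does not match a newline by default, so multi-line input would need the `s` modifier or per-line processing.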
Phill Sparks
  • So close! It seems to work pretty much across the board, except for a case like this: "abc 111.5 8.0c 6". Here, it should match 111.58 but doesn't match anything. (If you remove that errant "c" it correctly matches 58.06). – G.S. Apr 12 '13 at 17:10
  • @G.Moore ok, how about now? Any more test cases? – Phill Sparks Apr 12 '13 at 22:56
  • I think this is good! Thank you so much for the help on this, really appreciate it. I'll come back here if I find a case that's not working, but it should be good. – G.S. Apr 13 '13 at 16:19