I need to replace the decimal point inside a number with a colon. In short:

Input

Pick me up at 5.50 and take me to the zoo

Required

Pick me up at 5:50 and take me to the zoo


However, I don't want to replace a period that merely follows a number:

Input

Pick me up at 5. Take me to the zoo.

Required

Pick me up at 5. Take me to the zoo.

I could do this by brute force, but I believe a regex is the best solution here. Inevitably, as I am not a regex expert, I have run into Jamie Zawinski's "now they have two problems" quip.

I can match the number, but I am stuck on how to do the replacement.

\d\.\d+

I believe I need to use lookahead and/or grouping, but I'm not familiar with the syntax.

I've found previous questions like "Replace dot(.) with comma(,) using RegEx?" and "Remove decimal point when not between two digits". The first advises against using a regex at all. The second seems relevant, but I can't work out how to adapt its answer to get a match, let alone how to do the replacement.

stuartd
  • Try the following: `string input = "Pick me up at 5.50 and take me to the zoo"; string pattern = @"(?'integer'\d+)\.(?'fraction'\d+)"; string output = Regex.Replace(input, pattern, "${integer}:${fraction}");` – jdweng Oct 26 '17 at 10:56

2 Answers

To continue with your current solution, the digits on both sides of the dot must still be checked when matching, but left intact by the replacement.

There are 2 ways to achieve that:

  • Using capturing groups + backreferences in the replacement
  • Using lookarounds

Here is the first approach with capturing groups:

Regex.Replace(s, @"(\d)\.(\d+)", "$1:$2")

And the other with lookarounds:

Regex.Replace(s, @"(?<=\d)\.(?=\d)", ":")

See the regex demo 1 and regex demo 2.

Details

  • (\d) - Group 1 (referred to later with the $1 replacement backreference): any single Unicode digit. In the lookaround version its place is taken by
  • (?<=\d) - a positive lookbehind that requires a digit immediately to the left of the current location
  • \. - a literal dot
  • (\d+) - Group 2 (referred to later with the $2 replacement backreference): one or more Unicode digits. In the lookaround version its place is taken by
  • (?=\d) - a positive lookahead that requires a digit immediately to the right of the current location
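
As a minimal sketch of the two approaches side by side, the following runs both patterns from this answer against the two inputs from the question. The class and method names (`DotToColon`, `WithGroups`, `WithLookarounds`) are made up for illustration; only the patterns and replacement strings come from the answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DotToColon
{
    // Capturing groups: the digits are part of the match,
    // so they are written back through $1 and $2.
    public static string WithGroups(string s) =>
        Regex.Replace(s, @"(\d)\.(\d+)", "$1:$2");

    // Lookarounds: the digits are only checked, not consumed,
    // so the match is the dot alone and only the dot is replaced.
    public static string WithLookarounds(string s) =>
        Regex.Replace(s, @"(?<=\d)\.(?=\d)", ":");

    public static void Main()
    {
        string[] inputs =
        {
            "Pick me up at 5.50 and take me to the zoo",
            "Pick me up at 5. Take me to the zoo."
        };

        foreach (string input in inputs)
        {
            Console.WriteLine(WithGroups(input));
            Console.WriteLine(WithLookarounds(input));
        }
    }
}
```

Both methods should turn 5.50 into 5:50 and leave the second sentence untouched, because the dot after the 5 there is followed by a space rather than a digit.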

You can tighten it further so that it only matches between word boundaries, with a limiting quantifier on the second \d so that exactly 2 digits are matched after the dot:

\b(\d)\.(\d{2})\b

or

(?<=\b\d)\.(?=\d{2}\b)

See another regex demo (and Version 2 regex with lookarounds). Note that sometimes further word boundary adjustment is necessary.

To make sure you only match ASCII digits, use the RegexOptions.ECMAScript option, or replace every \d with [0-9].
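
The stricter variant could be sketched as below, assuming the capturing-group form with word boundaries and with \d spelled out as [0-9]. The names `StrictDotToColon` and `Fix` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StrictDotToColon
{
    // \b(\d)\.(\d{2})\b from the answer, with \d written as [0-9]
    // so that only ASCII digits are accepted.
    private static readonly Regex SingleDigitHour =
        new Regex(@"\b([0-9])\.([0-9]{2})\b");

    public static string Fix(string s) =>
        SingleDigitHour.Replace(s, "$1:$2");

    public static void Main()
    {
        Console.WriteLine(Fix("Pick me up at 5.50 and take me to the zoo"));
        Console.WriteLine(Fix("Pick me up at 10.50 and take me to the zoo"));
    }
}
```

Because this pattern allows a single digit before the dot and requires a word boundary in front of it, a two-digit hour such as 10.50 is left unchanged, which is the limitation raised in the comments below.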

Wiktor Stribiżew
  • I wouldn't say "need" capturing groups. It could just as well be done with look-arounds as OP is touching on. [I.e `(?<=\d)\.(?=\d+)` as seen here](http://regexstorm.net/tester?p=%28%3f%3c%3d%5cd%29%5c.%28%3f%3d%5cd%2b%29&i=Pick+me+up+at+5.50+and+take+me+to+the+zoo%0d%0aPick+me+up+at+5.+Take+me+to+the+zoo.&r=%3a). – SamWhan Oct 26 '17 at 11:04
  • @ClasG There are lots of ways to do the same thing. [Some consider lookarounds to be too "complex"](https://stackoverflow.com/questions/46897339/using-ruby-scan-split-join-with-regex/46897412#comment80744117_46897412) and blame me for using them. You blame for not using them. Using capturing groups here, in this scenario, is only natural, in my opinion. – Wiktor Stribiżew Oct 26 '17 at 11:08
  • I'm not blaming you. Yours is an excellent solution. I'm just teasing you for the use of the expression "*need to use capturing groups*". – SamWhan Oct 26 '17 at 11:18
  • @ClasG I changed the wording and added an alternative. – Wiktor Stribiżew Oct 27 '17 at 11:30
  • As I hinted earlier, I wasn't too *serious* in my criticism, and didn't expect you to change anything. However, with your addition comes an error (or shortcoming). What if I want to go to the zoo after 9.59, or in a country with a 24-hour clock? ;) – SamWhan Oct 27 '17 at 11:44
  • @ClasG [There is no issue](http://regexstorm.net/tester?p=%28%3f%3c%3d%5cd%29%5c.%28%3f%3d%5cd%29&i=Pick+me+up+at+10.50%2c+18.00+and+take+me+to+the+zoo%0d%0aPick+me+up+at+10.+Take+me+to+the+zoo.&r=%3a). Note that Warsaw zoo is open till 17:00, but ticket offices close an hour earlier. – Wiktor Stribiżew Oct 27 '17 at 11:51
  • LOL No issue there, but that's not the regex in your answer. [This is](http://regexstorm.net/tester?p=%28%3f%3c%3d%5cb%5cd%29%5c.%28%3f%3d%5cd%7b2%7d%5cb%29&i=Pick+me+up+at+10.50%2c+18.00+and+take+me+to+the+zoo%0d%0aPick+me+up+at+10.+Take+me+to+the+zoo.&r=%3a) – SamWhan Oct 27 '17 at 12:11
  • @ClasG That is not the main solution, see the beginning for the main one. The word boundary is a suggestion based on the original regex where only `\d` was used as the first digit matching pattern. If OP meant to match a single digit, then there must be some boundary. – Wiktor Stribiżew Oct 27 '17 at 12:16

For 24h format, you could use this regex:

/([0-1]?[0-9]|[2][0-3]).[0-5]?[0-9]/

Basically this matches:

Hours:

  • 5, 05 etc (meaning 5 AM)
  • 12, 15 etc (for PM)
  • 20, 21, 22, 23 but not 24 or 50 (for PM)

Minutes:

  • 5 for minutes without leading zero
  • 05 for leading zeros
  • all double-digit numbers that can be minutes in an hour (00-59)
  • but not numbers that cannot be minutes, like 60, 89 etc.

This ensures the replacement is done only when a valid time is entered. It's the closest you can get. Match the entire regex, replace the . with : inside the match, and put the result back.
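
One way to do the "match, replace the dot, put it back" step is a match evaluator, sketched here. Note the assumptions: the dot is escaped, word boundaries are added, and minutes are required to have two digits, all of which follow the asker's follow-up comment rather than the pattern exactly as written above. The names `TwentyFourHourFixer` and `Fix` are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TwentyFourHourFixer
{
    // Hours 0-23 (optional leading zero), a literal dot, minutes 00-59.
    // Assumption: escaped dot, \b anchors and two-digit minutes,
    // as in the asker's comment on this answer.
    private const string TimePattern =
        @"\b([01]?[0-9]|2[0-3])\.([0-5][0-9])\b";

    // The evaluator receives each whole match (for example "18.00")
    // and returns it with the dot swapped for a colon.
    public static string Fix(string s) =>
        Regex.Replace(s, TimePattern, m => m.Value.Replace('.', ':'));

    public static void Main()
    {
        Console.WriteLine(Fix("Pick me up at 5.50 and take me to the zoo"));
        Console.WriteLine(Fix("The zoo opens at 09.00 and closes at 18.00"));
        Console.WriteLine(Fix("The ticket costs 24.50 and the bus leaves at 5.75"));
    }
}
```

With this pattern, 24.50 and 5.75 are not valid times and should pass through unchanged, while 5.50, 09.00 and 18.00 get a colon.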

DanteTheSmith
  • I did use this in the end, but I had to escape the dot - `\b([0-1]?[0-9]|[2][0-3])\.([0-5][0-9])(am|pm)?\b` - cheers – stuartd Dec 12 '17 at 11:37