
I have written the following regex

(1[012]|[1-9])(am|pm)\-(1[012]|[1-9])(am|pm)

to match time ranges in the following format:

7am-10pm (matches correctly and creates 4 match groups 7, am, 10, pm)

13am-10pm (this should not be matched, however it matches and creates 4 match groups 3, am, 10, pm)

10pm (this doesn't match, as expected, because it doesn't specify the end of the time range)

111am-10pm (this should not be matched, however it matches and creates 4 match groups 11, am, 10, pm)

How can I improve my regex so that I don't need to repeat the digit and am/pm patterns, and so that it also does the following:

  1. It captures only the time range components, e.g. for 7am-10am there should be only 2 match groups: 7am and 10am.

  2. It matches only proper hours, e.g. 111am or 13pm should be considered a no-match.

  3. I don't know whether this is possible with a regex, but can it match only correctly ordered time ranges? For example, 7am-1pm should match, but 4pm-1pm should be considered a no-match.

Note: I am using Ruby 2.2.1

Thanks.

Jignesh Gohel

2 Answers

Your regex is missing ^ (start of line), which is why it can match starting in the middle of the input.

Anchor it like this:

^(1[012]|[1-9])(am|pm)\-(1[012]|[1-9])(am|pm)

Better solution: use \b (word boundary) instead, which also works when the time range does not start at the beginning of a line.

\b(1[012]|[1-9])(am|pm)\-(1[012]|[1-9])(am|pm)\b

See DEMO.
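As a quick check of the word-boundary version against the inputs from the question, here is a minimal Ruby sketch. The constant `BOUNDED` and the helper `bounded_groups` are names made up for this illustration.

```ruby
# Word-boundary version of the asker's pattern (still four capture groups).
BOUNDED = /\b(1[012]|[1-9])(am|pm)-(1[012]|[1-9])(am|pm)\b/

# Hypothetical helper: returns the captured groups, or nil when there is no match.
def bounded_groups(text)
  m = BOUNDED.match(text)
  m && m.captures
end

p bounded_groups("7am-10pm")            # ["7", "am", "10", "pm"]
p bounded_groups("13am-10pm")           # nil
p bounded_groups("111am-10pm")          # nil
p bounded_groups("open 7am-10pm daily") # ["7", "am", "10", "pm"]
```

The last call shows why \b is more flexible than ^: the range is found even though it does not start the line.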

karthik manchala

First, let's see what went wrong:

13am-10pm (this should not be matched, however it matches and creates 4 match groups 3, am, 10, pm)

it matches only proper hours for e.g. 111am or 13pm etc should be considered a no-match.

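This matches because the pattern is not anchored: the regex engine is free to skip the leading 1 and start the match at the next digit, where the single-digit alternative [1-9] in (1[012]|[1-9]) matches the 3 of 13am (and 1[012] matches the trailing 11 of 111am).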
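The behaviour is easy to reproduce in Ruby. The sketch below uses the pattern from the question unchanged; the constant name `ORIGINAL` and the helper `match_info` are made up for this illustration.

```ruby
# The asker's original, unanchored pattern.
ORIGINAL = /(1[012]|[1-9])(am|pm)\-(1[012]|[1-9])(am|pm)/

# Hypothetical helper: returns [matched text, start index] or nil.
def match_info(text)
  m = ORIGINAL.match(text)
  m && [m[0], m.begin(0)]
end

p match_info("13am-10pm")   # ["3am-10pm", 1]
p match_info("111am-10pm")  # ["11am-10pm", 1]
```

In both cases the match starts at index 1, i.e. the first digit is simply skipped.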

To fix this, the hour must be either a single digit [1-9] or a 1 followed by [0-2], and no other digit may come directly before it. Since we do not know where in the text the time range starts, we'll use a word boundary to make sure the match begins at a "word start".

Since you do not want to capture the bare numbers but the whole time including am|pm, you can put the hour alternatives in a non-capturing group inside the capturing one:

\b((?:1[0-2]|[1-9])[ap]m)

Then it's simply a matter of repeating ourselves and adding a dash:

\b((?:1[0-2]|[1-9])[ap]m)-((?:1[0-2]|[1-9])[ap]m)
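In Ruby the repetition can be avoided in the source by building the pattern from a smaller one through interpolation. This is only a sketch: the names `TIME`, `TIME_RANGE` and `range_parts` are made up for this illustration, and the resulting regex is meant to behave like the one above.

```ruby
# One time component, e.g. "7am" or "12pm" (no capture groups of its own).
TIME = /(?:1[0-2]|[1-9])[ap]m/

# Interpolating a Regexp embeds it as a non-capturing group,
# so only the two explicit parentheses capture.
TIME_RANGE = /\b(#{TIME})-(#{TIME})/

# Hypothetical helper: returns [start, end] or nil when there is no match.
def range_parts(text)
  m = TIME_RANGE.match(text)
  m ? m.captures : nil
end

p range_parts("7am-10am")    # ["7am", "10am"]
p range_parts("13am-10pm")   # nil
p range_parts("111am-10pm")  # nil
```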

Regarding point 3: yes, you could do this with a regex, but you are better off adding a plain logical check on groups 1 and 2, once you have them, to see whether the time range really makes sense.
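One possible shape for such a check in Ruby is sketched below. It assumes that a range is valid only when the start is strictly earlier than the end on the same day, so overnight ranges such as 10pm-2am are rejected; adjust that rule if your data allows them. The names `RANGE_PATTERN`, `to_24h` and `valid_range?` are made up for this illustration.

```ruby
RANGE_PATTERN = /\b((?:1[0-2]|[1-9])[ap]m)-((?:1[0-2]|[1-9])[ap]m)/

# Converts "7am" / "12pm" style strings to an hour in 0..23.
def to_24h(time)
  hour = time.to_i % 12            # "12am" -> 0, "12pm" -> 0, "7am" -> 7
  time.end_with?("pm") ? hour + 12 : hour
end

# Assumption: start must be strictly before end within a single day.
def valid_range?(text)
  m = RANGE_PATTERN.match(text)
  return false unless m
  to_24h(m[1]) < to_24h(m[2])
end

p valid_range?("7am-1pm")   # true
p valid_range?("4pm-1pm")   # false
```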

All in all, this is what you get:

# \b((?:1[0-2]|[1-9])[ap]m)-((?:1[0-2]|[1-9])[ap]m)
# 
# 
# Assert position at a word boundary «\b»
# Match the regular expression below and capture its match into backreference number 1 «((?:1[0-2]|[1-9])[ap]m)»
#    Match the regular expression below «(?:1[0-2]|[1-9])»
#       Match either the regular expression below (attempting the next alternative only if this one fails) «1[0-2]»
#          Match the character “1” literally «1»
#          Match a single character in the range between “0” and “2” «[0-2]»
#       Or match regular expression number 2 below (the entire group fails if this one fails to match) «[1-9]»
#          Match a single character in the range between “1” and “9” «[1-9]»
#    Match a single character present in the list “ap” «[ap]»
#    Match the character “m” literally «m»
# Match the character “-” literally «-»
# Match the regular expression below and capture its match into backreference number 2 «((?:1[0-2]|[1-9])[ap]m)»
#    Match the regular expression below «(?:1[0-2]|[1-9])»
#       Match either the regular expression below (attempting the next alternative only if this one fails) «1[0-2]»
#          Match the character “1” literally «1»
#          Match a single character in the range between “0” and “2” «[0-2]»
#       Or match regular expression number 2 below (the entire group fails if this one fails to match) «[1-9]»
#          Match a single character in the range between “1” and “9” «[1-9]»
#    Match a single character present in the list “ap” «[ap]»
#    Match the character “m” literally «m»
FailedDev