
I want to validate strings that come in several different formats, such as `10JUN2022`, `2Mx1D`, `4M`, `1D`, `TEN` and `ONE|TEN`. I have written the regular expression `^([0-9A-WYZa-wyz ]+)([xX|]([0-9A-WYZa-wyz ]+))?$` for that, and it works fine. I also need to validate one more kind of string, `2022-06-10`, but the expression fails to match it.

Ole V.V.
dilipyadav
  • Regex can help you with regular expressions, but when the expressions become numerous and very different from each other (like the ones you have here), your regular expression soon becomes hard to follow. I'd rather suggest you hold a collection of `SimpleDateFormat`s, each with a different date pattern, and loop through it when validating a date. It may be less efficient, but it's definitely far more flexible and, above all, more understandable. – Matteo NNZ Jun 16 '22 at 16:14
  • By the way, the [*ISO 8601*](https://en.m.wikipedia.org/wiki/ISO_8601) standard defines sensible, unambiguous date-time formats. I suggest educating the publisher of your data about adopting ISO 8601 to replace this hodgepodge of formats. For example, a span of two months and a day would be `P2M1D`. The tenth of June this year would be `2022-06-10`. – Basil Bourque Jun 16 '22 at 16:19
  • @MatteoNNZ Good suggestion, except that the `SimpleDateFormat` class was supplanted years ago by the modern *java.time* classes, specifically `DateTimeFormatter`. – Basil Bourque Jun 16 '22 at 16:21
  • `^(([0-9A-WYZa-wyz ]+)([xX|]([0-9A-WYZa-wyz ]+))?)|(\d{4}-\d{2}-\d{2})$`? – Ole V.V. Jun 16 '22 at 16:28
  • I would recommend also basing the validation on the meaning of those strings since it will help the reader of your code understand. I take `2Mx1D` to be 2 months 1 day? What is `TEN`? Nothing to do with date and time? What is `ONE|TEN`? – Ole V.V. Jun 16 '22 at 16:30
  • They are just different strings that I am validating with the same regex. – dilipyadav Jun 16 '22 at 17:00
  • @BasilBourque You're right, I always confuse the new one with the old one (until I import it and realize it's the other one I wanted to import). – Matteo NNZ Jun 16 '22 at 21:25
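The suggestion in the comments above, looping over a collection of *java.time* formatters instead of growing one regex, might look roughly like the sketch below. It covers only the two date formats from the question (`10JUN2022` and `2022-06-10`). The class name `DateStringValidator`, the method `isValidDate`, the `ddMMMuuuu` pattern and the choice of `Locale.ENGLISH` are assumptions made for illustration, not something the question specifies.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: validates date strings by trying each known formatter in turn.
final class DateStringValidator {

    // Assumed formats: "10JUN2022" (upper-case English month abbreviation) and ISO "2022-06-10".
    static final List<DateTimeFormatter> DATE_FORMATS = List.of(
            new DateTimeFormatterBuilder()
                    .parseCaseInsensitive()          // accept "JUN" as well as "Jun"
                    .appendPattern("ddMMMuuuu")
                    .toFormatter(Locale.ENGLISH)
                    .withResolverStyle(ResolverStyle.STRICT), // reject dates such as 31 February
            DateTimeFormatter.ISO_LOCAL_DATE);

    // Returns true if at least one formatter parses the whole string as a real calendar date.
    static boolean isValidDate(String s) {
        for (DateTimeFormatter formatter : DATE_FORMATS) {
            try {
                LocalDate.parse(s, formatter);
                return true;
            } catch (DateTimeParseException e) {
                // Not this format; try the next one.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"10JUN2022", "2022-06-10", "31FEB2022", "2Mx1D"}) {
            System.out.println(s + " -> " + isValidDate(s));
        }
    }
}
```

Unlike a regex, this also checks that the date exists on the calendar. The non-date strings (`2Mx1D`, `TEN`, `ONE|TEN`) would still need their own check.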

1 Answer


When it comes to regex, don't try to get overly clever. Just solve the basic problem. If that takes multiple regex patterns, so be it. It's much easier to maintain and read.

For the first date format (`10JUN2022`) I would use `[0-3]?\d\w{3}(1|2)\d{3}`, and for the second (`2022-06-10`) I would use `(1|2)\d{3}(-\d{2}){2}`. Or combine them if you must: `([0-3]?\d\w{3}(1|2)\d{3})|((1|2)\d{3}(-\d{2}){2})`
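As a rough sketch of the "multiple patterns" approach in Java, one could keep a small list of patterns and accept a string if any of them matches in full. The class name `MultiPatternValidator` and the method `isValid` are made up for illustration. Two details differ from the patterns above and are assumptions on my part: `[A-Za-z]{3}` replaces `\w{3}`, because `\w` also matches digits and underscores, and the pattern from the question is kept as a third entry so that strings like `2Mx1D` and `ONE|TEN` still pass.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: one small, readable pattern per accepted format.
final class MultiPatternValidator {

    static final List<Pattern> FORMATS = List.of(
            // e.g. 10JUN2022 (letters only for the month, an assumption)
            Pattern.compile("[0-3]?\\d[A-Za-z]{3}[12]\\d{3}"),
            // e.g. 2022-06-10
            Pattern.compile("[12]\\d{3}(?:-\\d{2}){2}"),
            // The pattern from the question, for 2Mx1D, 4M, 1D, TEN, ONE|TEN
            Pattern.compile("[0-9A-WYZa-wyz ]+(?:[xX|][0-9A-WYZa-wyz ]+)?"));

    // matches() requires the whole input to match, so no ^ or $ anchors are needed.
    static boolean isValid(String input) {
        for (Pattern pattern : FORMATS) {
            if (pattern.matcher(input).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] samples = {"10JUN2022", "2Mx1D", "4M", "1D", "TEN", "ONE|TEN", "2022-06-10", "2022/06/10"};
        for (String s : samples) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```

If the patterns are combined into one alternation and used with `find()` rather than `matches()`, the whole alternation should be wrapped in a group, as in `^(?:a|b)$`. Otherwise `^` applies only to the first branch and `$` only to the last.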

Ryan