1

I have text in a column like /AB/25MAR92/ and /AB/25MAR1992/. I am trying to extract just 25MAR92 and 25MAR1992 from the column for a date calculation that I have to work on. Can you please help with the REGEXP_SUBSTR function for this issue? Thanks!

Jude92
  • 167
  • 2
  • 6
  • 20
  • the data format format for the day of the month (2 or 02) doesn't matter as I just need the year.2 digits (Mar99) in some cases and 4 digits in some cases (Mar1999) – Jude92 May 02 '18 at 23:07

2 Answers2

2

You could try:

\b\d{1,2}[A-Z]{3}\d{2,4}\b

but this will also match 02MAR992. To exclude this possibility use:

\b\d{1,2}[A-Z]{3}(?:\d{2}|\d{4})\b

This will match 02MAR1992 and02MAR92 but will not match02MAR992.

builder-7000
  • 7,131
  • 3
  • 19
  • 43
1

I suggest using a pattern like this:

\/(\d{2}[A-Z]{3}(19|20)?\d{2})\/
  • Years are limited to 1900-2099.

Demo

  • If you do not want to allow any 2-digit value for the day \d{2},
    you could add this pattern instead (0[1-9]|[12][0-9]|3[01]) that matches 01-31;

    \/((0[1-9]|[12][0-9]|3[01])[A-Z]{3}(19|20)?\d{2})\/

  • Or if you allow dates like /AB/2MAR92/ that have days without a leading zero
    add (0[1-9]|[12][0-9]|3[01]|[1-9]) instead:

    \/((0[1-9]|[12][0-9]|3[01]|[1-9])[A-Z]{3}(19|20)?\d{2})\/

I've used / as anchors. If you don't like that, you can use \b.


In reaction to your latest comments, my recommended pattern looks like this:

\b\d{1,2}[A-Z]{3}(?:19|20)?\d{2}\b
wp78de
  • 18,207
  • 7
  • 43
  • 71