1

I have field with different text entered with a 13 or 17 Digit ID.Need to extract that ID from this field

 regexp_substr(TXT,'CTRL ACDV\\s+(\\d+)',1,1,'ie')..

Txt can can be like this

 SUPPRESSED AND FORWARDING CTRL{ACDV 36608732875895776 } {DRID 12345

   SUPPRESSED AND FORWARDING CTRL 9809770899005 TO FRAUD DUE TO ID TH

   SUPPRESSED AND FORWARDING CTRL ACDV 987878829039161097 .DRID 87569
regexp_substr(TXT,'CTRL ACDV\\s+(\\d+)',1,1,'ie')..

need to get

36608732875895776
9809770899005 
987878829039161097
Hamed Ghasempour
  • 435
  • 3
  • 12
Pavan
  • 11
  • 4
  • HI @CodeManiac...Thank you for the quick response.Tried this regexp_substr(TXT,'(?<=CTRL ADCV\s+)(?:\d{13}|\d{17})',1,1,'ie') got an error back "Invalid regular expression: '(?<=CTRL ADCVs+)(?:d{13}|d{17})', no argument for repetition operator: ?"Any help is appreciated. – Pavan Jun 10 '19 at 02:06
  • is the space after `CTRL ADCV` is consistent ? if yes than just remmove `+` after `\s` – Code Maniac Jun 10 '19 at 02:09
  • @CodeManiac..Yes the spaces are consistent.Removed the + after \s but still gettign the same error.Invalid regular expression: '(?<=CTRL ADCVs)(?:d{13}|d{17})', no argument for repetition operator: ? – Pavan Jun 10 '19 at 02:15
  • You can see the working [Demo](https://regex101.com/r/TC2EOV/1/) here – Code Maniac Jun 10 '19 at 02:15

3 Answers3

1

If you can assume the digits are a minimum length, this works for your 3 examples:

SELECT regexp_substr('SUPPRESSED AND FORWARDING CTRL{ACDV 36608732875895776 } {DRID 12345',
                      '(\\d{13,})', 1,1, 'e');

SELECT regexp_substr('SUPPRESSED AND FORWARDING CTRL 9809770899005 TO FRAUD DUE TO ID TH',
                      '(\\d{13,})', 1,1, 'e');

SELECT regexp_substr('SUPPRESSED AND FORWARDING CTRL ACDV 987878829039161097 .DRID 87569',
                      '(\\d{13,})', 1,1, 'e');
0

If the only big numbers are the ID's, then this is the shortest and fastest:

\d{13,17}

Test it here.


Be aware that the third ID (987878829039161097) is actually 18 digits long.

Therefore, if the minimum length is 13, you may want to use:

\d{13,}

Alternatively, if you want to delete everything except the long ID's, you can search for the regex:

([^\d]+|\d{,12})

and replace it with \n (= new line) or whatever you want (e.g. a space).

Test it here.

You may get better result if you do the replace in two steps. First for:

[^\d]+

(for non-digits)

and then for:

\s\d{1,12}(\s|$)

(for numbers with less than 13 digits)

virolino
  • 2,073
  • 5
  • 21
  • hi i tried it but its giving me null values codes i tried .*\bCTRL\D+(\d{13,18})\b.* and \d{13,17}.Any help is apperciated, – Pavan Jun 11 '19 at 02:26
  • I do not understand what null values you get. Will you please provide additional details? Why do you need a complicated expression, as you suggested? – virolino Jun 11 '19 at 06:08
  • when i use this formula regexp_substr(TXT,'( '\d{13,17}').Its not giving me any results.Just showing as Null. – Pavan Jun 11 '19 at 12:33
  • In the formula you have '( ' - but in the text (the one you posted in the question), you so not have a '(', but only '{'. Also, you do not have any symbol ( '(' or '{' ) followed by digits. You have "{ACDV" – virolino Jun 11 '19 at 12:41
  • Should I understand that you want to delete the parts like "SUPPRESSED AND FORWARDING CTRL{ACDV " ? And only keep the long ID's? – virolino Jun 11 '19 at 12:43
  • Thank you for the suggestions.So the text will be anything as it a random text..The text i provided above is an example.The only goal is i need to get the 13 or 17 digit ID.The text can be anything and the ID can be anywhere in the text.I need to get only the Long ID's as i need this IDs to join to other table and get the data from those tables. – Pavan Jun 11 '19 at 12:57
  • so the formula will be somwthing like this regexp_replace(TXT,'[^\d]+') as ACDVT ,regexp_replace(ACDVT,'\s\d{1,12}(\s|$)') – Pavan Jun 11 '19 at 13:23
  • I do not know the syntax for `regexp_replace()`. I suppose you need a third parameter, the replacement string, e.g. '\n' or ' '. – virolino Jun 11 '19 at 13:25
0

You might use a capturing group and use the (from the docs) e parameter to return only the part of the string that matches the first sub-expression in the pattern.

Note that the last number are 18 digits instead of 17.

\bCTRL\D+(\d{13,18})

Explanation

  • \bCTRL Match word boundary and CTRL
  • \D+ Match 1+ times not a digit
  • (\d{13,18}) Capture 1 group 1 matching 13 - 18 digits

Regex demo

Another option is to match 13 or more digits using \d{13,}

The docs state that the patterns are implicitly anchored at both ends, in that case you could use:

.*\bCTRL\D+(\d{13,18})\b.*

Regex demo

The fourth bird
  • 154,723
  • 16
  • 55
  • 70