I have a pipe delimited file which has a line

H||CUSTCHQH2H||PHPCCIPHP|1010032000|28092017|25001853||||

I want to replace the date (28092017), which I would match with the regex "[0-9]{8}", but only if the first character of the line is "H".

I tried the following example to test my understanding, where I'm trying to substitute "a" with "i".

str = "|123||a|"
str.gsub /\|(.*?)\|(.*?)\|(.*?)\|/, "\|\\1\|\|\\1\|i\|"

But this gives the output "|123||123|i|".

Any clue how this can be achieved?

Banjo

1 Answer

To replace the first run of exactly 8 digits enclosed in pipes when the string starts with H, you may use

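s = "H||CUSTCHQH2H||PHPCCIPHP|1010032000|28092017|25001853||||"
# Variant 1: keep the prefix via capturing group 1 (\1), then write eight zeros
p s.gsub(/\A(H.*?\|)[0-9]{8}(?=\|)/, '\100000000')
# Variant 2: drop the prefix from the match with \K, so only the digits are replaced
p s.gsub(/\AH.*?\|\K[0-9]{8}(?=\|)/, '00000000')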

See the Ruby demo. Here, the value is replaced with 8 zeros.

Pattern details

  • \A - start of string (^ is the start of a line in Ruby)
  • (H.*?\|) - Capturing group 1 (you do not need it when using the variation with \K): H, then any 0+ chars other than line break chars, as few as possible, then a |
  • \K - match reset operator that discards the text matched so far
  • [0-9]{8} - eight digits
  • (?=\|) - the next char must be |, but it is not added to the match value since it is a positive lookahead that does not consume text.
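
To illustrate the \A versus ^ point from the list above, here is a minimal sketch. The two-line sample string is made up for the demonstration and is not taken from the question's file.

```ruby
# Made-up sample: two lines, only the second one starts with "H".
text = "D|12345678|\nH|28092017|"

# \A anchors at the very start of the string, so nothing matches here
# because the string begins with "D".
anchored_to_string = text.sub(/\AH.*?\|\K[0-9]{8}(?=\|)/, '00000000')

# ^ anchors at the start of any line, so the second line is rewritten.
anchored_to_line = text.sub(/^H.*?\|\K[0-9]{8}(?=\|)/, '00000000')

p anchored_to_string == text  # => true
p anchored_to_line            # => "D|12345678|\nH|00000000|"
```

So \A is the safer choice when each record is held in its own string, while ^ would be needed if the whole file were processed as one multi-line string.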

The \1 in the first gsub is a replacement backreference to the value in Group 1; the eight zeros that follow it are literal replacement text.
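
If the file is read line by line, the same pattern could be wrapped in a small helper. This is only a sketch: the name mask_header_date, the default mask, and the second sample line starting with "D" are assumptions made for illustration, not part of the question.

```ruby
# Hypothetical helper: masks the first pipe-enclosed 8-digit field,
# but only on lines that start with "H".
def mask_header_date(line, mask = '00000000')
  # sub is enough here: \A allows at most one match per string.
  line.sub(/\AH.*?\|\K[0-9]{8}(?=\|)/, mask)
end

lines = [
  "H||CUSTCHQH2H||PHPCCIPHP|1010032000|28092017|25001853||||",
  "D||CUSTCHQH2H||PHPCCIPHP|1010032000|28092017|25001853||||" # made-up non-header line
]

masked = lines.map { |line| mask_header_date(line) }
puts masked
```

The 10-digit field 1010032000 is skipped because no run of exactly 8 digits in it sits directly between two pipes, and the "D" line is returned unchanged.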

Wiktor Stribiżew
  • Thanks, @Wiktor Stribiżew. Could you please elaborate on "(?=\|)"? Why is it not added to the match value? – Banjo Aug 28 '19 at 12:15
  • @Navi *it is not added to the match value since it is a positive lookahead that does not consume text* - "*lookaround actually matches characters, but then gives up the match, returning only the result: match or no match. That is why they are called "assertions". They do not consume characters in the string, but only assert whether a match is possible or not.*", see more about [lookaheads](https://www.regular-expressions.info/lookaround.html). – Wiktor Stribiżew Aug 29 '19 at 07:37
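
The difference between consuming the pipe and only asserting it can be seen in a short sketch. The shortened sample string and the variable names are chosen here for illustration.

```ruby
s = "|28092017|25001853|"

# Consuming the trailing pipe: it becomes part of the match.
consumed = s[/[0-9]{8}\|/]
# Asserting the trailing pipe with a lookahead: it is checked but not included.
asserted = s[/[0-9]{8}(?=\|)/]

p consumed  # => "28092017|"
p asserted  # => "28092017"

# In a replacement, the consumed pipe is overwritten...
replaced_consumed = s.sub(/[0-9]{8}\|/, '00000000')
# ...while the asserted pipe is left in place.
replaced_asserted = s.sub(/[0-9]{8}(?=\|)/, '00000000')

p replaced_consumed  # => "|0000000025001853|"
p replaced_asserted  # => "|00000000|25001853|"
```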