0

I need to replace all instances where a new line is not followed by a colon. This is because colon indicates the start of a new record (a record may span multiple lines).

REGEXP_REPLACE(raw,'(\n[^:])','this is a test')

For example, with the input:

:[MY DATA HERE]
[rest of record printed here]

produces the output:

:[MY DATA HERE]
this is a testrest of record printed here]

instead of:

:[MY DATA HERE]this is a test[rest of record printed here]

Note that the current output replaces the first non ':' character but not the new line. Does anyone know why this is not working as expected?

Thanks in advance

Barmar
  • 741,623
  • 53
  • 500
  • 612
Anthony Wood
  • 395
  • 1
  • 3
  • 16
  • The regexp matches the newline and the character after it, and replaces both of them. – Barmar Nov 19 '19 at 03:08
  • Does EXASolution allow you to use back-references in the replacement string? Or do they support negative lookaheads? – Barmar Nov 19 '19 at 03:09
  • Yes they do, but I'm not sure how to use them https://docs.exasol.com/sql_references/functions/alphabeticallistfunctions/regexp_replace.htm – Anthony Wood Nov 19 '19 at 03:10
  • It should replace the newline and the character after but the output still has the newline – Anthony Wood Nov 19 '19 at 03:11

1 Answers1

1

You're replacing the newline and the character after it. You need to capture the character after it, and put that in the replacement. You can refer to capture groups using \number

REGEXP_REPLACE(raw,'\n([^:])','this is a test\1')

See the documentation

Exasol also supports PCRE, so you should be able to use a negative lookahead:

REPLACE_REGEXP(raw, '\n(?!:)', 'this is a test')

These differ in one subtle way: Your regexp won't match a newline at the end of the string, since there's nothing to match [^:] there. The negative lookahead will match at the end.

Barmar
  • 741,623
  • 53
  • 500
  • 612