How to write a regular expression inside awk to IGNORE a word as a whole?

Question

I want to exclude a word, say "Dogma", from my regular expression inside an awk script, so that every line except the ones containing "Dogma" is matched by the regex. How can I do that?

input

<animal>Because I am a seeker of truth, I do not accept every bit of dogma as fact</animal><animal>The dog kept barking all night</animal><animal>The mills of God grind slowly</animal>

my regex

reg = "<" animal ">" [(^Dogma)+]</" animal ">"

desired output

<animal>Because I am a seeker of truth, I do not accept every bit of dogma as fact</animal><animal>The d## kept barking all night</animal><animal>The mills of G## grind slowly</animal>

I match each line against the regex, and if the line matches the regex defined above, the script extracts the desired word and replaces it with #. The logic works fine for other scenarios but not this one, because this regular expression also ignores lines that contain "Dogs" or "Gods". How can I make the regex ignore the word as a whole? Any suggestions would be appreciated.

This has also been [discussed here](https://unix.stackexchange.com/questions/318839/awk-negative-regular-expression). If AWK supported PCRE, it could be done using negative lookahead. — MyICQ, Mar 14 '22 at 06:42
Just to spell out the misunderstanding, `[^abc]` means "match _a single character_ which is not one of `a`, `b`, or `c`." There is no facility to directly say "match a string which isn't `abc`" in Awk regex. — tripleee, Mar 14 '22 at 07:29
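
Since awk regexes cannot express "a string which isn't `dogma`", one common workaround is to match the element with a plain pattern and then reject unwanted matches with a separate negated test (`!~`). The following is only a sketch under assumptions: the function name `mask_animals` is hypothetical, the `dog` → `d##` and `God` → `G##` substitutions are inferred from the desired output in the question, and the check for "dogma" is assumed to be case-insensitive. It uses only POSIX awk features (`match`, `RSTART`, `RLENGTH`, `gsub`, `tolower`).

```shell
# Hypothetical helper: masks words inside <animal> elements,
# skipping any element that mentions "dogma".
mask_animals() {
  awk '
  {
    out = ""
    rest = $0
    # Walk the line one <animal>...</animal> element at a time.
    while (match(rest, /<animal>[^<]*<\/animal>/)) {
      elem = substr(rest, RSTART, RLENGTH)
      # Keep any text that precedes the element unchanged.
      out = out substr(rest, 1, RSTART - 1)
      # Negate a separate match instead of encoding "not dogma" in the regex.
      if (tolower(elem) !~ /dogma/) {
        # Assumed substitutions, inferred from the desired output.
        gsub(/dog/, "d##", elem)
        gsub(/God/, "G##", elem)
      }
      out = out elem
      rest = substr(rest, RSTART + RLENGTH)
    }
    # Append whatever follows the last element.
    print out rest
  }'
}
```

It reads from standard input, so it could be called as `mask_animals < input.txt` (the file name is a placeholder). Keeping the exclusion as its own `!~` test means the "dogma" rule can change without touching the element pattern.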
