
I'm sorry if this has already been asked and answered, but I can't find it.

I know about regex lookarounds and negative lookahead.

The thing is that a negative lookahead only examines what comes right after the current position in the string.

What I need is to discard a match if the string contains words like "career(s)" or "specials", for example, no matter where in the string they appear.

What would be an efficient way of doing that?

At the moment I'm using the PCRE flavor, but the more general the regex is, the better.

Thank you.

toni rmc

1 Answer


You can use this regex:

^(?!.*(?:career\(s\)|specials)).*

Or, if the trailing `s` is optional, use:

^(?!.*(?:career|special)s?).*

RegEx Demo
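
For anyone who wants to try this outside a regex tester, here is a minimal sketch in Python rather than PCRE. The pattern is the second one above; the helper name `is_allowed` and the sample strings are made up for illustration. It assumes each input is a single line, since `.` does not cross newlines unless you pass `re.DOTALL`.

```python
import re

# Second pattern from the answer: the lookahead is anchored at the start
# and scans the whole line, so the banned words are rejected anywhere.
NO_BANNED_WORDS = re.compile(r"^(?!.*(?:career|special)s?).*")


def is_allowed(line: str) -> bool:
    """Hypothetical helper: True when the line contains none of the banned words."""
    return NO_BANNED_WORDS.match(line) is not None


# Illustrative sample strings (not from the original question).
samples = [
    "We sell shoes and hats",
    "Check our careers page",
    "Today's specials are listed below",
]

for text in samples:
    print(f"{text!r}: allowed={is_allowed(text)}")
```

Note that the pattern does a plain substring test, so a word such as "specialist" is rejected too. Wrapping the alternation in `\b...\b` would be one way to restrict it to whole words, if that is what you need.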

anubhava
  • Super helpful, I've been looking for this for a long time, thanks a lot! – LoneCodeRanger Oct 31 '22 at 13:42
  • I do have a tiny follow-up question though: what if we were to do the negative lookahead anywhere in a string *except* after a particular character, how would we do that? Say for instance we don't want to match a string containing "career" in between certain parentheses, as in `person = Person(career='their_job')`, but we would want to match `person = Person(); career = 'their_job'`, because "career" appears *after* the closing parenthesis `)`. – LoneCodeRanger Oct 31 '22 at 14:18
  • You can try: `^(?!.*\([^()]*career[^()]*\)).*` – anubhava Oct 31 '22 at 14:24
  • That looks very good indeed! I was just about to get to something along those lines too but you were faster! Thank you sir! – LoneCodeRanger Oct 31 '22 at 14:26
  • By the way, the debugger on https://regex101.com/ is very helpful, I just discovered it thanks to your link, and I'll be using it from now on! – LoneCodeRanger Oct 31 '22 at 14:27
  • Okay I have one last challenge if you're up to it, but I know this is asking a lot, so I won't mind if you don't answer this one. What if we wanted to only take into account the `)` if it is the one matching the opening one. So that we *would not* get a match for `Person(email=email.lower(), career='their_job')` (notice the call to `lower` already closes one parenthesis, thus the negative lookahead currently stops there). I don't even know if this is possible with regex, but it's a real use-case I have right now while searching for particular usages of a class in code. – LoneCodeRanger Oct 31 '22 at 14:58
  • Again, I am more than happy with what we got until now, so this is just kind of my perfectionism asking. – LoneCodeRanger Oct 31 '22 at 14:59
  • Matching nested parentheses requires recursion, which PCRE supports but many other flavors (for example JavaScript and Python's built-in `re` module) do not. – anubhava Oct 31 '22 at 15:12
  • Well, thanks again sir :) This has already considerably reduced the number of matches we need to consider for an issue we have. – LoneCodeRanger Oct 31 '22 at 15:24
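
To make the parentheses discussion in the comments concrete, here is a rough Python sketch. `FLAT` is the pattern from the comment above. The function names `flat_allows` and `word_inside_parens` are hypothetical, and the second one is not a regex at all: it swaps in a plain depth counter as one possible workaround for flavors without recursion. It assumes balanced parentheses and ignores parentheses inside string literals, so treat it as a starting point only.

```python
import re

# Pattern from the comment: reject a line when "career" sits inside a
# parenthesised group that itself contains no other parentheses.
FLAT = re.compile(r"^(?!.*\([^()]*career[^()]*\)).*")


def flat_allows(line: str) -> bool:
    """True when the flat (non-nested) pattern accepts the line."""
    return FLAT.match(line) is not None


def word_inside_parens(line: str, word: str = "career") -> bool:
    """Hypothetical helper: True if `word` starts while at least one '(' is open."""
    depth = 0
    for i, ch in enumerate(line):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # tolerate a stray ')'
        elif depth > 0 and line.startswith(word, i):
            return True
    return False


inside = "person = Person(career='their_job')"
after = "person = Person(); career = 'their_job'"
nested = "Person(email=email.lower(), career='their_job')"

for line in (inside, after, nested):
    print(f"{line!r}: flat_allows={flat_allows(line)}, "
          f"inside_parens={word_inside_parens(line)}")
```

The nested example is the interesting one: the flat pattern still accepts it, because `[^()]*` cannot step over the inner `lower()` call, while the depth counter reports "career" as being inside the outer parentheses.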