I am trying to create a custom regex to detect Social Security numbers in O365 DLP. The conditions are that the first three digits must not be 000, 666, or 150, and the last four digits must not be 0000. I came up with the regex below:

(?!000|666|150)\d{3}-\d{2}-(?!0000)\d{4}

This works fine.

Need solution: what if I want to exclude the same pattern when it is preceded by a word, for example Apple: 173-12-9878 or Content: 173-12-9878? I tried adding the words to a negative lookahead, as in (?!Apple: |Content: )(?!000|666|150)\d{3}-\d{2}-(?!0000)\d{4}, but I am not able to get this to work.

Please advise and also suggest if there is a better way to achieve this. Thanks.

1 Answer

Use a regex with a lookbehind:

\b(?<!Apple: |Content: )(?!0{3}|666|150)\d{3}-\d{2}-(?!0{4})\d{4}\b

See proof & explanation.

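The (?<!Apple: |Content: ) negative lookbehind prevents matches immediately after Apple: and Content:. A negative lookahead cannot do this, because it checks the text that follows the current position (the digits themselves), not the text that precedes it.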
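
To illustrate the difference, here is a minimal sketch in Python. Python is an assumption made for demonstration only: O365 DLP uses its own regex engine, so lookbehind support there should be verified separately. Python's re module only accepts fixed-width lookbehinds, so the alternation is split into two consecutive lookbehinds, which should behave the same way for these inputs. The sketch uses the question's 000 condition, written as 0{3}, and the sample strings come from the question plus one hypothetical unprefixed line.

```python
import re

# The attempt from the question: a lookahead inspects the text *after* the
# current position, so at the first digit it sees "173-...", never "Apple: ".
LOOKAHEAD = re.compile(
    r"(?!Apple: |Content: )(?!000|666|150)\d{3}-\d{2}-(?!0000)\d{4}"
)

# Lookbehind variant. Python's re needs fixed-width lookbehinds, so the
# alternation is split into two lookbehinds (an adaptation for this sketch).
LOOKBEHIND = re.compile(
    r"\b(?<!Apple: )(?<!Content: )(?!0{3}|666|150)\d{3}-\d{2}-(?!0{4})\d{4}\b"
)

# The last sample is a hypothetical line without an excluded prefix.
samples = ["Apple: 173-12-9878", "Content: 173-12-9878", "SSN 173-12-9878"]

lookahead_hits = [bool(LOOKAHEAD.search(s)) for s in samples]
lookbehind_hits = [bool(LOOKBEHIND.search(s)) for s in samples]

print(lookahead_hits)   # [True, True, True]: the prefixes are not excluded
print(lookbehind_hits)  # [False, False, True]: only the unprefixed number matches
```

The lookahead version still matches all three strings because the engine simply starts the match at the first digit, where the text ahead is never Apple: or Content:.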

Note that \b is a word boundary. It prevents the pattern from matching inside a longer run of digits, such as a number with extra digits before or after the SSN-shaped part.
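
The effect of the word boundaries can be checked with a small Python sketch. Python is again an assumption used only for demonstration, and the input string is a made-up example. The lookbehinds are left out here so that only the \b behavior is shown.

```python
import re

# Same digit rules with and without the surrounding word boundaries.
WITH_BOUNDARY = re.compile(r"\b(?!0{3}|666|150)\d{3}-\d{2}-(?!0{4})\d{4}\b")
WITHOUT_BOUNDARY = re.compile(r"(?!0{3}|666|150)\d{3}-\d{2}-(?!0{4})\d{4}")

# Hypothetical input: an SSN-shaped fragment buried inside longer numbers.
text = "ref 9173-12-98785"

without_b = WITHOUT_BOUNDARY.search(text)
with_b = WITH_BOUNDARY.search(text)

print(without_b.group())  # 173-12-9878, taken from the middle of the digits
print(with_b)             # None, the boundaries reject the partial match
```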

Ryszard Czech