I have looked at multiple posts about this, and am still having issues.

I am attempting to write a regex query that finds the names of S3 buckets that do not follow the naming scheme we want. The scheme we want is as follows:

test-bucket-logs**-us-east-1**

The bolded part is optional, meaning the following two are valid bucket names:

  1. test-bucket-logs
  2. test-bucket-logs-us-east-1

Now, what I want to do is negate this, so that I catch all buckets that do not follow the scheme above. I have successfully formed a regex that matches the naming scheme, but am having issues forming one that negates it. The regex is below:

^(.*-bucket-logs)(-[a-z]{2}-[a-z]{4,}-\d)?$

So some more valid bucket names:

  1. example-bucket-logs-ap-northeast-1
  2. something-bucket-logs-eu-central-1

Invalid bucket names (we want to match these):

  1. Iscrewedthepooch
  2. test-bucket-logs-us-ee
  3. bucket-logs-us-east-1

Thank you for the help.

Lawrence Aiello
  • Negating a regexp is generally difficult. Either use a negative lookahead, or do the negation in the code that performs the match -- remove the items that match instead of keeping them. – Barmar Dec 15 '16 at 16:38
  • I think negating the result in the code is what I will go with. Thank you! – Lawrence Aiello Dec 15 '16 at 16:55

1 Answer

As Barmar said, the best approach in these circumstances is probably to solve it programmatically: write the usual regex that matches the valid pattern, then exclude the matching names from the collection.
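A minimal Python sketch of that programmatic approach, reusing the regex from the question. The function name `invalid_bucket_names` and the sample list are only illustrative; in practice the names would come from wherever you list your buckets.

```python
import re

# Pattern from the question: matches names that FOLLOW the scheme.
VALID_NAME = re.compile(r"^(.*-bucket-logs)(-[a-z]{2}-[a-z]{4,}-\d)?$")


def invalid_bucket_names(names):
    """Return the names that do NOT match the naming scheme."""
    return [name for name in names if VALID_NAME.match(name) is None]


# Sample names taken from the question (illustrative input only).
names = [
    "test-bucket-logs",
    "test-bucket-logs-us-east-1",
    "example-bucket-logs-ap-northeast-1",
    "something-bucket-logs-eu-central-1",
    "Iscrewedthepooch",
    "test-bucket-logs-us-ee",
    "bucket-logs-us-east-1",
]

print(invalid_bucket_names(names))
```

The negation lives in the `is None` check, so the regex itself stays the simple, positive one.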

But you can try this:

^(?:.(?!-bucket-logs-[a-z]{2}-[a-z]{4,}-\d|-bucket-logs$))*$

This is a typical solution using a negative lookahead, `(?!...)`, a zero-width assertion that consumes no characters and captures nothing. Basically, it matches every line in which no character is followed by the valid pattern.
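As a quick check, here is a small Python sketch that runs the pattern above against the sample names from the question. The second pattern, `ALTERNATIVE`, is not part of the original answer. It is my own assumption of a simpler variant that puts the question's regex inside a single lookahead at the start of the string.

```python
import re

# Negated pattern from this answer: matches names that BREAK the scheme.
INVALID_NAME = re.compile(
    r"^(?:.(?!-bucket-logs-[a-z]{2}-[a-z]{4,}-\d|-bucket-logs$))*$"
)

# Assumption, not from the answer: negate the question's regex directly
# with one lookahead anchored at the start of the string.
ALTERNATIVE = re.compile(
    r"^(?!.*-bucket-logs(?:-[a-z]{2}-[a-z]{4,}-\d)?$).*$"
)

valid = [
    "test-bucket-logs",
    "test-bucket-logs-us-east-1",
    "example-bucket-logs-ap-northeast-1",
    "something-bucket-logs-eu-central-1",
]
invalid = [
    "Iscrewedthepooch",
    "test-bucket-logs-us-ee",
    "bucket-logs-us-east-1",
]

for name in valid + invalid:
    flagged = INVALID_NAME.match(name) is not None
    print(f"{name}: {'invalid' if flagged else 'valid'}")
```

One caveat: the first alternative inside the lookahead has no `$`, so a name such as `test-bucket-logs-us-east-1-extra` is not flagged by `INVALID_NAME`, even though the question's regex rejects it. The `ALTERNATIVE` sketch does flag it.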

EDITED

As Ibrahim pointed out (thank you!), there was a small issue with my first regex. I have fixed it and I think it is OK now: I had forgotten to make the last part of the inner regex optional (?).

Victor Lia Fook
  • Your code matches `test-bucket-logs`, which is a valid bucket name. Consider adding an exception to exclude it, maybe something like this: `^(?:.(?!-bucket-logs-[a-z]{2}-[a-z]{4,}-\d|-bucket-logs$))*$` – Ibrahim Dec 15 '16 at 17:33