
We have a scripting engine that allows our customers to make web requests from inside a .NET App Domain. Penetration testers pointed out that our scripting engine allows customers to make web requests against the AWS metadata URL (http://169.254.169.254/latest/meta-data), which we need to prevent. I know how to create a whitelist of URLs using code that looks like:

        ' Requires Imports System.Net, System.Security.Permissions,
        ' and System.Text.RegularExpressions
        Dim perm As New System.Net.WebPermission(PermissionState.None)
        Dim metadataUrl As New Regex("http://169\.254\.169\.254/.*")
        perm.AddPermission(Net.NetworkAccess.Connect, metadataUrl)

But I want to create a blacklist containing only the AWS metadata URL. I know that blacklists are generally frowned upon for security purposes, but more and more of our customers are using RESTful APIs, and we can't release a new code version every time a customer wants to talk to some new service. How can I do this? Is there a regex pattern that will match every URL except those that match the string above?

Bert Cushman

1 Answer


If I understand correctly, you want a regex that will exclude 169.254.169.254 while matching other valid URLs. If so, try this:

^(?!.*169\.254\.169\.254.*)(https?:\/\/)?([\da-z\.-]+\.[a-z\.]{2,6}|[\d\.]+)([\/:?=&#]{1}[\da-z\.-]+)*[\/\?]?$

The part that does the negative match is (?!.*169\.254\.169\.254.*), and it needs to go before the other things you're trying to match. It's called a negative lookahead.
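To see the lookahead in action, here's a quick sanity check. It uses Python's re module purely for brevity; the pattern string itself is the same one from the answer, and negative lookaheads behave identically in .NET's Regex engine, so the result carries over to your VB.NET code:

```python
import re

# The answer's pattern: a negative lookahead rejecting the metadata IP,
# followed by a general-purpose URL matcher (illustrative, not exhaustive).
pattern = re.compile(
    r"^(?!.*169\.254\.169\.254.*)"
    r"(https?:\/\/)?([\da-z\.-]+\.[a-z\.]{2,6}|[\d\.]+)"
    r"([\/:?=&#]{1}[\da-z\.-]+)*[\/\?]?$"
)

# An ordinary URL matches; the metadata URL is rejected by the lookahead.
assert pattern.match("http://example.com/api/v1") is not None
assert pattern.match("http://169.254.169.254/latest/meta-data") is None
```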

And finally, if you want to tinker with it live, try Regexr, where you can test it against URLs.

CAVEAT: There are probably literally thousands of different regexes for matching URLs. I just grabbed one that seems pretty good, but test it thoroughly to be sure it matches what you need. If not, try another recipe with the negative-match part at the beginning; Regexr has other recipes you can try. Just be sure your pattern starts with the negative lookahead.

UPDATE: You can easily implement multiple negative lookaheads, as documented here. There are a couple of syntactically similar valid ways to code it; here's one:

^(?!abc:)(?!defg:)
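Since each blocked entry just becomes its own lookahead, you can generate the chain from a list rather than hand-writing it. A minimal sketch (again in Python for brevity; the generated pattern works the same in .NET, and the second blocked host is a hypothetical example, not from the question):

```python
import re

# Hypothetical blocklist; each entry becomes its own negative lookahead.
blocked_prefixes = [
    "http://169.254.169.254/",
    "http://metadata.google.internal/",  # assumption: a second blocked host
]

# re.escape makes the dots in IPs literal; chain one lookahead per entry.
lookaheads = "".join(f"(?!{re.escape(p)})" for p in blocked_prefixes)
pattern = re.compile("^" + lookaheads + ".*")

assert pattern.match("https://example.com/") is not None
assert pattern.match("http://169.254.169.254/latest/meta-data") is None
```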

If needed, you could maintain the actual list (or the regex itself) in a file in AWS S3 and have the client download it at runtime, ensuring the client always uses the latest data without you having to recode. For simplicity, I'd make the file just a list of hosts/IPs and have the client parse it and build the regex. That reduces the chance of making an error during a simple update.
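The "parse the list, build the regex" step could look like the sketch below (Python for brevity; the S3 download itself is omitted, and the function name and the sample hosts are illustrative, not from the question):

```python
import re

def build_blocklist_regex(hosts):
    """Build a pattern rejecting any URL that mentions one of the hosts.

    `hosts` would be the parsed lines of the file fetched from S3
    (the fetch itself is omitted here); each entry is escaped so the
    dots in IPs are treated literally.
    """
    lookaheads = "".join(f"(?!.*{re.escape(h)})" for h in hosts)
    return re.compile("^" + lookaheads + ".*")

# Example with a hypothetical two-line blocklist file:
pattern = build_blocklist_regex(["169.254.169.254", "10.0.0.1"])
assert pattern.match("http://example.com/") is not None
assert pattern.match("http://169.254.169.254/latest/meta-data") is None
```

Keeping the file as plain hosts (one per line) rather than a full regex means a routine update can't accidentally break the pattern syntax.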

technonaut
  • Thanks. I tried "^(?!http://169\.254\.169\.254/).*" which seemed to do the trick. Regexr was helpful. This approach seems kludgy, since it would be hard to extend to blacklisting multiple URLs, but if no one comes up with a better solution, this might be the best we can do. – Bert Cushman Aug 11 '20 at 18:59
  • It is a bit of a work-around, but I was trying to give you what you asked for (a regex solution). You could write code to dynamically generate a regex string from a list of IPs or hostnames before using it, and/or have the client download either the latest regex itself or just the latest list of hosts/IPs from an S3 bucket, so you never have to change your program itself - it just gets the latest information when it needs it. Doing multiple negative lookaheads is easy - I'll update the answer with it. P.S. It's much appreciated if you upvote the answer if it works for you. – technonaut Aug 12 '20 at 14:39