We have an application that accepts URLs from users. This data needs validation, and we're using ESAPI for this purpose. However, we're struggling with URLs containing ampersands.

The problem appears when ESAPI canonicalizes the data before validation. For example, `&pid=123` in the URL turns into `πd=123`, because the canonicalizer decodes `&pi` as the HTML entity for π. Since π is not whitelisted, the validation fails.

I've tried encoding it, but ESAPI is smarter than that and does canonicalization to avoid double encoding and mixed encoding. I'm a bit stumped here and I'm not sure how to proceed.

user2754648
  • I just answered an almost identical question here: http://stackoverflow.com/a/23448264/557153 – avgvstvs May 03 '14 at 18:45
  • As it stands right now, Sri's answer introduces a vulnerability by shutting off canonicalization. See http://cwe.mitre.org/data/definitions/180.html – avgvstvs May 06 '14 at 11:21
  • Thanks, but it doesn't seem to help me. URI.getPath() only returns the decoded path component of the URI. So I would end up only validating a portion of the input with this method. – user2754648 May 07 '14 at 12:00
  • Study the URI API. There's also `URI.getScheme()`, `URI.getAuthority()`, `URI.getPath()`, and `URI.getQuery()` to name just a few pieces. I also just updated my answer to demonstrate how to canonicalize an entire URL using library parsers. You need to break it up and rebuild it in order for it to work correctly. – avgvstvs May 07 '14 at 13:03

2 Answers


I faced the same issue. In my case, for the string \fgdf\gghfh\fgh\dff, the canonicalize method produced the following:

Case 1: canonicalize(string) --> INTRUSION - Multiple (2x) encoding detected in \fgdf\gghfh\fgh\dff

Case 2: canonicalize(string, false) --> input=fgdfgghfhfghdff. In this case, string validation failed because this ? character is not part of the whitelist of characters.

I finally managed to get it working. Below is the code:

    value = ESAPI.encoder().encodeForURL(value);
    value = value.replaceAll("", "");
    isSafe = validator.isValidInput("APPNAME", value, "URLSTRING", 255, true, false);

The last parameter, `false`, turns off the internal canonicalization, which is on by default.

I hope this helps.

Brad Larson
Sri
  • I fail to see what the call `value = value.replaceAll("", "");` accomplishes, if anything. – avgvstvs May 03 '14 at 18:32
  • This evades the protection that canonicalize is supposed to afford you in the first place. Why even use ESAPI then? -1 – avgvstvs May 03 '14 at 18:45
  • @avgvstvs `value = value.replaceAll("", "");` That's part of standard ESAPI procedure to eliminate any blank characters. I didn't take the time to explore in more depth what it does. Also, canonicalization is skipped only for path fields, which are identified by form field names. And since we are skipping canonicalization, the whitelist character set becomes very restrictive. – Sri May 05 '14 at 14:23
  • But you're applying whitelist characters against dirty input. That's a security no-no. Without a call to `*.canonicalize()` you cannot trust that your regexes won't be fooled by different types of character encoding attacks. That's the entire point behind `canonicalize`. – avgvstvs May 05 '14 at 19:19
  • Having worked with ESAPI for nearly 3 years now, I've never once seen a reference to `value.replaceAll("", "")` as any kind of standard. What do you mean by "blank characters"? Can you provide a test case? – avgvstvs May 05 '14 at 19:32
  • Here's the CWE that is relevant to validation before canonicalization: http://cwe.mitre.org/data/definitions/180.html – avgvstvs May 06 '14 at 11:20
  • Here's a test case that demonstrates beyond any reasonable doubt that `String.replaceAll("","")` does absolutely nothing: stackoverflow.com/questions/23587519/esapi-and-using-replaceall-for-blank-strings – avgvstvs Aug 01 '14 at 17:47
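
The point made in the comments above can be checked with a few lines of plain Java. This is only a minimal sketch; the class and method names are hypothetical and not part of ESAPI. An empty regex matches the empty string at every position, so replacing each match with an empty string returns the input unchanged, whereas a whitespace pattern such as `\s` actually removes blanks.

```java
// Hypothetical demo class; not part of ESAPI.
final class ReplaceAllDemo {

    // The call from the answer: an empty pattern replaced by an empty string.
    static String emptyPattern(String value) {
        return value.replaceAll("", "");
    }

    // What stripping "blank characters" would look like with a real whitespace pattern.
    static String whitespacePattern(String value) {
        return value.replaceAll("\\s", "");
    }

    public static void main(String[] args) {
        String input = "a b\tc";
        System.out.println(emptyPattern(input).equals(input)); // input comes back unchanged
        System.out.println(whitespacePattern(input));          // blanks are removed
    }
}
```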

This problem is a known bug in ESAPI. I have started working on a fix, but since I don't know when a patch will get committed, I can only refer you to the workaround from my comment on the question, where I linked a similar answer: http://stackoverflow.com/a/23448264/557153. It uses java.net.URI and javax.ws.rs.core.UriBuilder to parse the URL into its parts, canonicalize the pieces, and then reconstruct the URL. The example I put forth is in the second half of the question, after the OP switched topics mid-question.
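
As a rough illustration of the break-down, canonicalize, and rebuild idea, here is a minimal sketch rather than the code from the linked answer. It assumes a few things that are not in the original: it rebuilds the URL with the multi-argument `java.net.URI` constructor instead of `UriBuilder` (to avoid the JAX-RS dependency), it takes the canonicalizer as a function parameter (with ESAPI you would pass something like `s -> ESAPI.encoder().canonicalize(s)`), and the class and method names are hypothetical. Because the query is split on `&` before anything is canonicalized, the canonicalizer never sees `&pid` as a single token.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical helper; not part of ESAPI.
final class UrlCanonicalizer {

    // Applies the canonicalizer to one component; absent components stay null.
    private static String clean(String part, UnaryOperator<String> canonicalizer) {
        return part == null ? null : canonicalizer.apply(part);
    }

    // Parses the URL, canonicalizes each piece on its own, and rebuilds it.
    // The fragment is ignored in this sketch.
    static String canonicalizeUrl(String rawUrl, UnaryOperator<String> canonicalizer) {
        try {
            URI uri = new URI(rawUrl);
            String scheme = clean(uri.getScheme(), canonicalizer);
            String authority = clean(uri.getAuthority(), canonicalizer);
            String path = clean(uri.getPath(), canonicalizer); // getPath() is already percent-decoded

            // Rebuild everything except the query; this constructor re-encodes illegal characters.
            StringBuilder out = new StringBuilder(
                    new URI(scheme, authority, path, null, null).toString());

            String rawQuery = uri.getRawQuery();
            if (rawQuery != null) {
                List<String> pairs = new ArrayList<>();
                // Split on '&' first, so each name and value is canonicalized separately.
                for (String pair : rawQuery.split("&")) {
                    String[] kv = pair.split("=", 2);
                    String name = clean(URLDecoder.decode(kv[0], "UTF-8"), canonicalizer);
                    String encoded = URLEncoder.encode(name, "UTF-8");
                    if (kv.length == 2) {
                        String value = clean(URLDecoder.decode(kv[1], "UTF-8"), canonicalizer);
                        encoded += "=" + URLEncoder.encode(value, "UTF-8");
                    }
                    pairs.add(encoded);
                }
                out.append('?').append(String.join("&", pairs));
            }
            return out.toString();
        } catch (URISyntaxException | UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Not a usable URL: " + rawUrl, e);
        }
    }

    public static void main(String[] args) {
        // Stand-in canonicalizer (identity); swap in the ESAPI call in real code.
        UnaryOperator<String> standIn = s -> s;
        System.out.println(canonicalizeUrl("http://example.com/view?id=1&pid=123", standIn));
    }
}
```

The rebuilt string is what you would then hand to the validator. Note that `URLEncoder` uses form encoding, so a space in a query value comes back as `+` rather than `%20`.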

Community
avgvstvs
  • This took care of things. The UriBuilder output malforms the start of the URL, but it's fairly easy to fix. Thanks for the help :) – user2754648 May 12 '14 at 13:21
  • I fixed it last June. It'll be in the next release, and the heavily tested code lives here: https://github.com/ESAPI/esapi-java-legacy/blob/develop/src/main/java/org/owasp/esapi/reference/DefaultValidator.java – avgvstvs Feb 02 '17 at 15:22