I'm implementing a web filter for my organization and I'm considering Zscaler. I do not want to use a proxy PAC file. I just got off the phone with Zscaler's sales team, and they claim that they can differentiate my users post-NAT using cookies. They did not explain how it works but showed me a demo. My topology is as follows:

RFC 1918 Space -> FW -> 1.1.1.1 --- 1.1.1.2 -> Router --> Internet

Essentially, at the router above I will build a GRE tunnel to their ZEN node. The ZEN will only see my public IP, 1.1.1.1.

Upon first visiting the internet I will have to authenticate. After that, the user sessions are tracked using cookies. This doesn't make sense to me because:

  • Two sites, cnn.com and reddit.com for example, will have completely different cookies set by my browser. Zscaler will see something like:
    • 1.1.1.1:23883 --> cnn.com:80 + HTTP headers and potentially cookies sent by the browser which are unique to cnn.com and don't necessarily ID me as joeDomainUser.
    • 1.1.1.1:26364 --> reddit.com:80 + HTTP headers and potentially cookies sent by the browser which are unique to reddit.com and don't necessarily ID me as joeDomainUser.

Sure, if I authenticate while going to cnn.com, it can inject a cookie into the response, but how will this track me when I go to reddit.com? The browser will send different cookies.

NAScar0

1 Answer

You need to remember that, as a proxy, they are in the path for all web requests (or at least, all non-SSL requests unless you're using SSL decryption).

The first time you go to cnn.com they will see that you do not have one of their authentication cookies for that site. Rather than serving the site, they will temporarily redirect you to their authentication site.

Once you get to that authentication site, one of two things will happen. If you are already authenticated to Zscaler (which they can tell from cookies, in this case the ones for their auth site), they will redirect you back to cnn.com with a query string (?xxx=yyy added to the URL) that uniquely identifies you. When your browser follows that redirect, they will intercept the request again and redirect you once more, this time back to the original URL (cnn.com without the extra query string), and in that response they will set a Zscaler authentication cookie for the cnn.com domain. They can do this because, as far as your browser is concerned, the response is coming from the real cnn.com, so it allows the cookie to be set (even though the response is really coming from Zscaler).

After that, every request to cnn.com will include their auth cookie for that domain, so they will know that it's you making the request. After a period of time (24 hours, I think) that cookie will time out, and they will go through the whole redirect process again.

If you weren't already authenticated to Zscaler during the first redirect above, then they would make you log in, and then follow the same process.

If you look through your cookie list after enabling authentication like this you'll find that every site has a new cookie being set for it - that's the Zscaler auth cookie for that domain.
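
To make the sequence above concrete, here is a minimal Python sketch that simulates it in memory. It is not Zscaler's actual implementation: the auth host name, the cookie name, the query parameter name, and the `origin` parameter are all hypothetical, and a real product would show a login page where this sketch just returns 401.

```python
import secrets
from urllib.parse import parse_qs, parse_qsl, urlencode, urlsplit, urlunsplit

# All three names below are hypothetical, chosen for this sketch only.
AUTH_HOST = "auth.proxy.example"
COOKIE = "px_auth"
TOKEN_PARAM = "px_token"


def with_param(url, name, value):
    """Return url with name=value appended to its query string."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query) + [(name, value)]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


def without_param(url, name):
    """Return url with every occurrence of the query parameter removed."""
    parts = urlsplit(url)
    pairs = [(k, v) for k, v in parse_qsl(parts.query) if k != name]
    return urlunsplit(parts._replace(query=urlencode(pairs)))


class TransparentProxy:
    def __init__(self):
        self.auth_sessions = {}  # auth-site cookie value -> user
        self.tokens = {}         # one-time query-string token -> user
        self.site_cookies = {}   # (host, cookie value) -> user

    def log_in(self, user):
        """Stand-in for the login form; returns the auth-site cookie value."""
        value = secrets.token_hex(8)
        self.auth_sessions[value] = user
        return value

    def handle(self, url, cookies):
        """Return (status, location, set_cookie, user) for one request."""
        parts = urlsplit(url)
        query = parse_qs(parts.query)
        if parts.hostname == AUTH_HOST:
            user = self.auth_sessions.get(cookies.get(COOKIE))
            if user is None:
                return 401, None, None, None  # real product: show login page
            token = secrets.token_hex(8)
            self.tokens[token] = user
            # Send the browser back to the original site, carrying the token.
            return 302, with_param(query["origin"][0], TOKEN_PARAM, token), None, None
        if TOKEN_PARAM in query:
            user = self.tokens.pop(query[TOKEN_PARAM][0], None)
            if user is not None:
                # Set a per-domain cookie and redirect to the clean URL.
                value = secrets.token_hex(8)
                self.site_cookies[(parts.hostname, value)] = user
                return 302, without_param(url, TOKEN_PARAM), value, None
        user = self.site_cookies.get((parts.hostname, cookies.get(COOKIE)))
        if user is not None:
            return 200, None, None, user  # identified: serve the real site
        # No cookie for this domain yet: bounce to the auth site.
        origin = without_param(url, TOKEN_PARAM)
        return 302, with_param(f"http://{AUTH_HOST}/", "origin", origin), None, None


def browse(proxy, jar, url):
    """Follow redirects with a per-host cookie jar; return (status, user, hops)."""
    hops = 0
    while True:
        host = urlsplit(url).hostname
        status, location, set_cookie, user = proxy.handle(url, jar.get(host, {}))
        if set_cookie:
            jar.setdefault(host, {})[COOKIE] = set_cookie
        if status != 302:
            return status, user, hops
        url, hops = location, hops + 1


proxy = TransparentProxy()
jar = {AUTH_HOST: {COOKIE: proxy.log_in("joeDomainUser")}}
print(browse(proxy, jar, "http://cnn.com/"))     # (200, 'joeDomainUser', 3)
print(browse(proxy, jar, "http://cnn.com/"))     # (200, 'joeDomainUser', 0)
print(browse(proxy, jar, "http://reddit.com/"))  # (200, 'joeDomainUser', 3)
```

The first visit to each domain costs three redirects, later visits cost none, and the jar ends up with one separate cookie per domain plus the one for the auth site.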

Despite what they might claim, the way they are doing this isn't all that unique; various other web security products use exactly the same mechanism. There is a slight performance impact from all of the redirects (which occur at least once for every domain, every 24 hours), but generally it's not significant enough to be noticeable.

Doc
  • They claim they are not proxying though. They claim this is done transparently when you GRE to the solution, unless you are injecting proxy headers in the firewall? – NAScar0 Dec 22 '16 at 21:11
  • They are a transparent proxy. – Doc Dec 22 '16 at 21:50
  • I get that, but how is it differentiating my traffic? Let's say I go to cnn.com and I don't have the magic cookie. I probably get 302 redirected to the login box. Let's say I'm cookied to the login site. How does it know I went to cnn originally? When I then get 302'd back to cnn, how does it know that I'm the original joeUser? I'm lost on how it's connecting my authentication at the login box website to cnn.com and any other site. Thanks for the detailed answer. Well earned upvote :). – NAScar0 Dec 22 '16 at 22:04
  • When it does the redirect, information like the site you were going to is included in the redirect (in the query string sent to the auth site). I already covered your second question: the user details are also passed in the query string of the redirect. Note that this is all EXACTLY the same regardless of whether you're using PAC files or GRE. I'd suggest you install something like Fiddler or LiveHTTPHeaders and look for yourself; it's all fairly obvious once you see it. – Doc Dec 24 '16 at 07:30