
I am not sure which Stack Exchange site is best for this question... but here goes:

Is it a terrible idea to implement a transparent HTTPS proxy to provide local caching for a local network?

We are limited by bandwidth and unfortunately cannot get a better connection due to our location.

The domain uses Active Directory and automatically trusts the local PKI. I am nervous about allowing the proxy to impersonate all endpoints... but it would work and wouldn't require much effort to get running. So my question: is this a common practice, or is it unwise? Or perhaps both?

There are lots of benefits... but it seems pretty dangerous.
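
For concreteness, this kind of setup is typically built with an intercepting proxy such as Squid and its SSL-Bump feature, with the bumping CA issued by the AD-trusted internal PKI. The fragment below is only a rough sketch of what that looks like; Squid itself is my assumption (the question doesn't name a proxy), and the port, helper path, and certificate locations are placeholders that vary by Squid version and distribution.

```
# squid.conf fragment (illustrative only; adjust ports, paths and helper
# names for your Squid version and distribution)

# Intercepted HTTPS traffic is redirected to this port (e.g. via firewall rules)
https_port 3129 intercept ssl-bump cert=/etc/squid/internal-ca.pem generate-host-certificates=on dynamic_cert_mem_cache_size=4MB

# Helper that mints per-site certificates signed by the internal CA
sslcrtd_program /usr/lib/squid/security_file_certgen -s /var/lib/squid/ssl_db -M 4MB

# Peek at the TLS handshake first, then bump (decrypt) everything --
# this is the "impersonate all endpoints" part of the question
acl step1 at_step SslBump1
ssl_bump peek step1
ssl_bump bump all

# Give cached objects somewhere to live
cache_dir ufs /var/spool/squid 10000 16 256
```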

user319862
  • Basically it's a terrible idea. Anyway: _what_ do you want to cache? – Ipor Sircer Sep 20 '16 at 19:58
  • @IporSircer: Are you implying it would be a better practice to selectively cache based on whitelisted endpoints? Otherwise I am not sure what you mean by your question. I want to cache all web resources configured with appropriate cache headers. – user319862 Sep 20 '16 at 20:00
  • Most HTTPS content is unique from user to user (login + session), so anything you cache won't be reused. And nowadays almost all content is dynamic, so even the same user won't request it again... besides, he has his own cache in the browser. So what type of content can be shared among the users? – Ipor Sircer Sep 20 '16 at 20:06
  • @IporSircer: All static media on an HTTPS site is served via HTTPS, or the browser throws security exceptions. Python pip downloads all packages via HTTPS; Docker downloads data via HTTPS. It is more widespread than dynamic data, and with search engines giving higher rankings to HTTPS sites, more and more regular HTTP traffic will become HTTPS. – user319862 Sep 20 '16 at 20:10
  • Then do it and share your experiences. – Ipor Sircer Sep 20 '16 at 20:12
  • @IporSircer Well, I am thinking it is a terrible idea, like you originally said =P. I was hoping it was common practice and I could overlook it that way. – user319862 Sep 20 '16 at 20:13

1 Answer


As many have said in the comments, it is a terrible idea for many reasons. First and foremost among them is that the vast majority of content is dynamic and user-specific, so it wouldn't make sense to cache it anyway. Furthermore, your cache would inherently lack the ability to distinguish sensitive information from non-sensitive information (edit: if a website is misconfigured).

Imagine that I'm User A and I visit my email inbox at a hypothetical address, https://www.qwertyuiop-mail.com/inbox. The server has identified me as logged in, assumes a secure connection, and shows all my messages to me.

User B then decides to check his email at https://www.qwertyuiop-mail.com/inbox as well, and because he went directly there, your cache server says, "Hey! I have this page cached! I'll just serve User B what I just served User A." Bam: User B now sees User A's page, and probably a sizable portion of what only User A should see.

As far as the server owner is concerned, his system should work just fine because he isn't expecting anything to be cached. You will have effectively created a passive man-in-the-middle attack that any user on your system could potentially abuse.
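
To make the failure mode concrete, here is a deliberately naive sketch (plain Python, with hypothetical names and a fake origin server) of a shared cache that keys responses only by URL and ignores cookies and Cache-Control entirely, which is effectively what a misconfigured site plus an over-eager cache adds up to:

```python
# A deliberately naive shared cache, keyed only by URL, that ignores
# cookies and Cache-Control. Names and the fake origin server are
# hypothetical; this only illustrates the scenario described above.

naive_cache = {}

def fetch_via_naive_cache(url, cookie, origin):
    """Return a cached body if this URL was seen before, else ask the origin."""
    if url in naive_cache:            # cache hit: whoever fetched first "wins"
        return naive_cache[url]
    body = origin(url, cookie)        # simulate contacting the real site
    naive_cache[url] = body           # stored with no notion of whose response this is
    return body

def mail_origin(url, cookie):
    # The imaginary mail server personalises the page from the session cookie
    # and (incorrectly) never marks the response "Cache-Control: private".
    return f"Inbox contents for {cookie}"

print(fetch_via_naive_cache("https://www.qwertyuiop-mail.com/inbox", "session=user_a", mail_origin))
# -> Inbox contents for session=user_a
print(fetch_via_naive_cache("https://www.qwertyuiop-mail.com/inbox", "session=user_b", mail_origin))
# -> Inbox contents for session=user_a   (User B is served User A's page)
```

A real cache does honor Cache-Control and Vary, which is exactly what the comments below go on to debate; the danger being pointed at here is the one site that gets those headers wrong.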

If you want to do this kind of thing, I would recommend only doing it for certain whitelisted sites that you know have no login functionality at all, or for which you prohibit logging in in some way. (Wikipedia may be a good candidate for this, as long as you block logins to the site.)
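
Continuing the Squid setup assumed earlier (the question doesn't name a proxy), a whitelist-based variant could splice (pass through untouched) everything by default and only bump, and therefore cache, the handful of vetted, login-free sites. The domain names below are placeholders:

```
# Only decrypt (and therefore cache) explicitly vetted sites; everything
# else is spliced through untouched. Domains are examples only.
acl step1 at_step SslBump1
acl vetted_sites ssl::server_name .wikipedia.org .ubuntu.com files.pythonhosted.org

ssl_bump peek step1
ssl_bump bump vetted_sites
ssl_bump splice all
```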

Andrew Hendrix
  • I'll add a final thought down here: With a blanket approach, all it takes is one website that is incompatible with your cache server, and suddenly you have an enormous security problem on your hands. – Andrew Hendrix Sep 20 '16 at 20:29
  • Just want to comment that "your cache would inherently lack the ability to distinguish sensitive information from non-sensitive information" is wrong. HTTP headers indicate what is and is not allowed to be cached, and sensitive information is marked accordingly... I wrote my question assuming the web cache would respect cache headers like a proper and responsible web citizen. – user319862 Sep 20 '16 at 20:35
  • For example, the situation you describe is prevented by "Vary: Cookie" (edit: or by disallowing public caching outright with Pragma). – user319862 Sep 20 '16 at 20:38
  • I should probably have been more clear then. If a site is improperly configured, your cache server will cause a man-in-the-middle attack. The safest way to mitigate that is to test common sites and whitelist them one by one. They would never know their site is misconfigured because this type of scenario isn't common practice. – Andrew Hendrix Sep 20 '16 at 20:41
  • Understood - this is certainly something to consider. But I think for sites served over the internet this shouldn't be an issue, because most sites are served through a CDN (a reverse proxy instead of a forward proxy), and if the wrong caching headers are present then the scenario you've described would already be observed there. – user319862 Sep 20 '16 at 20:48
  • For sites using a reverse proxy successfully, you would probably be ok, but like I said before, it only takes one exception to cause a security problem. – Andrew Hendrix Sep 20 '16 at 20:53
  • I am leaning towards not allowing the network proxy, or insisting on a whitelist approach targeting bandwidth-intensive sites. Thank you for the discussion and comments. – user319862 Sep 20 '16 at 20:55
  • HTTPS is _not_ cached by default, and not at all unless the server sends a Cache-Control response header allowing a resource to be cached. This is the reverse of (insecure) HTTP, which _is_ cacheable by default, and _is_ cached unless a Cache-Control header specifies not to cache it. – Michael Hampton Sep 20 '16 at 21:41