
We are currently seeing a degraded experience for one of our customers on our main production site. All subpages and resources seem to be affected as well.

The customer reports a completely broken experience: the site does not work correctly at all, mostly because assets fail to load.

We have started investigating and, so far, nothing seems to be wrong with the site itself.

Quick rundown:

  • The production site sits behind a Cloudflare layer, and almost all of its assets are delivered either via cdnjs or Amazon's CloudFront (behind Cloudflare). All assets are reachable via HTTP as well.
  • The site uses SSL and enforces it (the dynamic cert from Cloudflare).
  • We were able to secure a HAR file for a request to one of our sites; the request times in it are extremely long. If you would like to look at it, here is an online HAR viewer; be sure to uncheck validation of the file.

The customer uses Internet Explorer 8 and Chrome (39). The site is not optimized for IE8, but it should run fine in Chrome. In fact, it runs just fine for all of us in most browsers newer than IE9.

Notes

We already ruled out:

  • Virtual delivery problems (there could be physical limitations we are not aware of)
  • General faultiness of our setup (We tried three different open VPNs to verify this)
  • Being on the customer's blacklist by accident (although we cannot be entirely sure of this)
  • SSL Server name indication (SNI) problems
  • (Potentially) a general problem with the customer's network: the customer does not report any problems with "the rest of the internet".

The customer will not give us access to their VPN or disclose security details, so we cannot really reproduce the situation ourselves. We suspect that the customer uses an internal proxy that might cause the problems described, but we are not sure.

Questions

My questions here are:

Is there any known problem caused by internal networking in conjunction with our setup that can cause this behaviour?

Are there potential problems on our end that we could have overlooked, or things that we do differently from other sites?

Florian

2 Answers


It seems the connection is being made (or routed) through a low-bandwidth, high-latency link (or a very congested one). Most of the DNS lookups and connects seem to be taking ~10s.

In the HAR you can see that it affects fonts.googleapis.com and cdnjs.cloudflare.com. https://www.google-analytics.com/analytics.js has no data captured. To me, the claim that the customer does not report any problems with "the rest of the internet" seems rather dubious, given that in this HAR the browser was not able to load the analytics JS and access to the usual CDNs is very slow.
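To make that reading of the HAR repeatable, here is a minimal Python sketch that lists the hosts whose DNS or connect phase was slow. It relies only on the standard HAR layout (`log.entries[].timings.dns` and `.connect`, in milliseconds, with -1 meaning "does not apply"). The function name `slow_phases_by_host` and the 1000 ms threshold are my own choices for illustration, not something taken from the question.

```python
import json
import sys
from collections import defaultdict
from urllib.parse import urlsplit


def slow_phases_by_host(har, threshold_ms=1000):
    """Return {host: {"dns": max_ms, "connect": max_ms}} for every host where
    the DNS or connect phase reached threshold_ms in at least one request."""
    worst = defaultdict(lambda: {"dns": 0, "connect": 0})
    for entry in har["log"]["entries"]:
        host = urlsplit(entry["request"]["url"]).hostname
        timings = entry.get("timings", {})
        for phase in ("dns", "connect"):
            value = timings.get(phase)
            # HAR uses -1 (or omits the field) when a phase does not apply,
            # for example on a reused connection.
            if value is not None and value > worst[host][phase]:
                worst[host][phase] = value
    return {
        host: phases
        for host, phases in worst.items()
        if max(phases.values()) >= threshold_ms
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python har_slow_hosts.py capture.har  (script name is hypothetical)
    with open(sys.argv[1], encoding="utf-8-sig") as fh:
        report = slow_phases_by_host(json.load(fh))
    for host, phases in sorted(report.items()):
        print(f"{host}: dns={phases['dns']} ms, connect={phases['connect']} ms")
```

If third-party hosts such as the Google and cdnjs ones show up alongside your own, that points at the client's network path rather than at your setup.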

My guesses (pick one or more):

  • they are testing on a different machine from the one on which they have no problems with "the rest of the internet"
  • this machine is very, very slow
  • it has some kind of content filter, antivirus, or other software filtering web traffic (perhaps with an SSL certificate installed in order to forge and inspect HTTPS traffic)
  • the access is done through a congested route, or a low bandwidth high latency link
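The filtering guess can be checked from the affected machine, if the customer is willing to run a small script. This is a sketch in Python using only the standard `ssl` and `socket` modules: it prints the issuer of the certificate that the machine actually receives. The helper names `issuer_fields` and `fetch_issuer` are hypothetical. If the issuer is a corporate or antivirus CA instead of the public CA you see from your own network, traffic is being intercepted.

```python
import socket
import ssl
import sys


def issuer_fields(cert):
    """Flatten the 'issuer' entry of an ssl getpeercert() dict into a plain dict."""
    return {key: value for rdn in cert.get("issuer", ()) for key, value in rdn}


def fetch_issuer(host, port=443, timeout=10):
    """Connect to host and return the issuer of the certificate that is presented."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # server_hostname sends SNI, as a browser would.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return issuer_fields(tls.getpeercert())


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python cert_issuer.py your-site.example  (host is a placeholder)
    print(fetch_issuer(sys.argv[1]))
```

Note that a certificate verification error raised by this script is itself a hint: it can mean the presented certificate was signed by a CA that this Python installation does not trust, which is what an inspecting proxy may look like from the outside.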
Jorge Nerín

Two hotspots:

  1. CDN access points can sometimes be inconsistent with each other; I spent a lot of time tracking this issue down. How? In a live session with the client, I opened each loaded resource one by one and found that the files differed between CDN access points (mine in eastern Europe, his in central Europe). The CDN was hosted by one of the biggest US players in the world. We fixed this by invalidating (deleting) all files on the CDN so that new, correct ones were loaded.

  2. You need a CDN that supports serving files over HTTPS; then use that CDN for the SSL requests.
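The one-by-one comparison from the first point can be scripted so that both sides run the same check and compare output. Below is a minimal Python sketch using only the standard library. The names `fingerprint` and `fetch_fingerprint` are hypothetical, and the header list is an assumption: `CF-RAY` and `X-Amz-Cf-Pop` are headers that Cloudflare and CloudFront commonly add, but your responses may carry different ones.

```python
import hashlib
import sys
import urllib.request

# Headers that often reveal which edge location answered (an assumption,
# adjust to whatever your CDN actually sends).
EDGE_HEADERS = ("CF-RAY", "X-Amz-Cf-Pop", "Age", "ETag", "Last-Modified")


def fingerprint(body, headers):
    """Summarize one response: content hash, length, and edge-related headers."""
    present = {}
    for name in EDGE_HEADERS:
        value = headers.get(name)
        if value is not None:
            present[name] = value
    return {
        "sha256": hashlib.sha256(body).hexdigest(),
        "length": len(body),
        "headers": present,
    }


def fetch_fingerprint(url, timeout=30):
    """Download url and return its fingerprint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return fingerprint(resp.read(), resp.headers)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python asset_fingerprint.py <asset-url> [<asset-url> ...]
    for url in sys.argv[1:]:
        print(url, fetch_fingerprint(url))
```

Run it on your side and on the client's side against the same asset URLs. Different `sha256` values for the same URL mean the two edge locations are serving different files, which is the situation that the invalidation fixed.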

SilentTremor