0

My production server is Windows Server 2012 R2, Apache 2.4, PHP 5.6, and MariaDB (unsure of version, probably irrelevant). I have a customized site running on Silverstripe 3.5.

Lately, I'm getting ridiculous TTFBs with certain actions, most notably logging in (over 30 seconds), or sometimes at random while browsing the backend. Curiously, this does not happen on our development server which is configured similarly, and our IT department is going to pour over httpd.conf and php.ini to see if there are any differences which could be an issue.

I enabled the Zend OpCache extension in hopes that it would help, but I'm not seeing any significant improvements.

I installed XDebug and triggered profiling while logging in (by adding a hidden form field named XDEBUG_PROFILE via the DOM inspector), but the output when viewed in WinCachegrind says that PHP's execution took no more than 5 seconds cumulatively - i.e. the total cumulative time in main.php was 4.x seconds. There doesn't seem to be a notable bottleneck in the framework logic or database access.

httpd.exe takes up a solid 14% of my CPU for the duration of the request. Watching the logs with Powershell's gci -Wait shows that the request isn't logged in access.log until after the response is sent (not sure if normal behavior) and there are no errors popping up.

I'm stumped as to why Apache is taking so long to process a request that PHP claims to only take 4 seconds to perform logic on. While potential solutions may be appreciated, I'd really like to know where else I can look in terms of diagnostics to figure out what's causing Apache to slow down like this, e.g. file access, tracing execution, etc? I haven't the foggiest as to where to even begin looking.

p0lar_bear
  • 103
  • 1
  • 7
  • 1
    You look at the part you inexplicably skipped over, despite it being the most important part of the slowest requests: the database. And of course, why on earth are you attempting to run Apache/PHP/MySQL on Windows?! – Michael Hampton Sep 07 '18 at 20:24
  • Because my company contracts for some US government organizations, and thus the infrastructure is a Windows domain network thanks to bureaucracy imposing obtuse security requirements that must be followed. That's what I think, anyway - I'm just a peon codemonkey in this organization. Everything here is Windows by default, and honestly if I knew the first thing about programming ASP.NET back at the outset of this project, I probably would have used a .NET language. – p0lar_bear Sep 07 '18 at 20:30
  • @MichaelHampton "You look at the part you inexplicably skipped over, despite it being the most important part of the slowest requests: the database." What and where should I be looking for, exactly? Because if the database was the bottleneck, I expect I would have seen something indicating as such in the profiler output, like high average or cumulative time spent in the code of the database abstraction layer. – p0lar_bear Sep 07 '18 at 20:35
  • OK, so what did you see in the profiler output? – Michael Hampton Sep 07 '18 at 20:39
  • I already said that in the question: the cumulative time spent in the script's entire execution was ~4.5 seconds. 4 seconds =/= 30 seconds, hence my complete bewilderment. And, drilling down to database access calls showed negligible average and cumulative times. – p0lar_bear Sep 07 '18 at 20:43
  • So what's taking the most time, then?! – Michael Hampton Sep 07 '18 at 20:49
  • What's taking the most time in that profiler output is irrelevant to the issue at hand. Again: It's saying that the total time it took Silverstripe's `main.php` (and any other scripts it `include`d and `require`d) to take my request and generate output was 4 seconds. Ergo, everything else in the list took less time than that to be called and run. The mystery is what the hell is taking Apache those additional 20-30 seconds to respond to my request? Because according to the profiler output, the database is not blocking script execution for anywhere near that long. – p0lar_bear Sep 07 '18 at 21:06
  • Let's be perfectly clear: You're saying that a 30 second request only spent 4 wall clock seconds in PHP? Or 4 CPU seconds (which could be 4 actual seconds or an actual week)? – Michael Hampton Sep 07 '18 at 21:09
  • The "Cumulative Time" column in WinCachegrind's interface says a number in the range of 4000ms-5000ms. I'm assuming that is actual time elapsed and not CPU time. – p0lar_bear Sep 07 '18 at 21:11

1 Answers1

0

The database was not the bottleneck. Ironically enough, it was the cache - it's a known issue that Zend_Cache_Backend_File-based caches are disgustingly slow if the site has a lot of data, templates, and images to be cached. Surely enough, my silverstripe-cache folder on the server had over 5,000 files in it.

Adding the following line to ./mysite/_config.php to disable caching resulted in immediate, overwhelming improvements (35 second TTFB on login down to less then 0.5 seconds!):

SS_Cache::set_cache_lifetime('default', -1, 100);

However, this is a temporary fix. We will be looking into implementing a more permanent, better-practice solution for handling caching, such as using a ramdisk in conjunction with the filesystem backend, or leveraging a backend which uses memory pools like XCache or Memcached, as well as working towards moving to Silverstripe 4.x which dumps Zend_Cache in favor of symfony/cache.

But seeing as my office loves to let bureaucracy get in the way of everything... might be a bit before that happens.

p0lar_bear
  • 103
  • 1
  • 7
  • Only 5000 files? that's not very many. – hookenz Sep 20 '18 at 20:55
  • 5,000 files, in which I'm certain a fairly-sized portion of them were being browsed for, accessed, and rewritten on every single call to `DataObject::write()`. Disk I/O slows right down when dealing with multiple files - in my experience, copying, say, one 1GB file tends to go a lot faster than writing 1,024 1MB files. – p0lar_bear Sep 24 '18 at 18:41