I have a web application with highly variable response times, and I can't find out what causes them.
This is the (fairly complicated) setup:
A powerful dedicated server (i7, 32 GB RAM) in a German data center runs Debian 6 with Proxmox. On that host, there's an OpenVZ container configured to use 2 CPU cores and 2 GB of RAM. In this container, I run Ubuntu 12.04 and a Redmine instance (a Rails application).
Redmine is served by Apache 2 through Phusion Passenger.
Apache vhost config (container):
<VirtualHost *:80>
    ServerName redmine.somedomain.com
    DocumentRoot /var/www/redmine/
    <Directory "/var/www/redmine/">
        RailsBaseURI /
        PassengerResolveSymlinksInDocumentRoot on
        Options FollowSymLinks
        AllowOverride All
        Order allow,deny
        Allow from all
    </Directory>
    RewriteEngine On
    # Check for maintenance file and redirect all requests
    RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
    RewriteCond %{SCRIPT_FILENAME} !maintenance.html
    RewriteRule ^.*$ /system/maintenance.html [L]
    # Rewrite index to check for static
    RewriteRule ^/$ /cache/index.html [QSA]
    # Rewrite to check for Rails cached page
    RewriteRule ^([^.]+)$ /cache/$1.html [QSA]
    ErrorLog /var/log/apache2/redmine.error.log
    CustomLog /var/log/apache2/redmine.access.log combined
    ServerSignature Off
</VirtualHost>
Passenger config:
<IfModule mod_passenger.c>
    PassengerRoot /var/lib/gems/1.9.1/gems/passenger-3.0.19
    PassengerRuby /usr/bin/ruby
    PassengerDefaultUser redmine
    PassengerDefaultGroup redmine
    PassengerPoolIdleTime 0
    PassengerMinInstances 4
    PassengerMaxPoolSize 10
    PassengerStatThrottleRate 600
    RailsFrameworkSpawnerIdleTime 0
    RailsAppSpawnerIdleTime 0
</IfModule>
The container doesn't have an external IP, so I use an Apache reverse proxy on the host.
The config for that is (host):
<VirtualHost *:443>
    ServerName redmine.somedomain.com
    SSLProxyEngine On
    ProxyRequests off
    ProxyPreserveHost on
    ProxyPass / http://192.168.2.101/ keepalive=on max=100
    ProxyPassReverse / http://192.168.2.101/
    SSLEngine on
    SSLCertificateFile /data/private/101/etc/apache2/ssl/redmine.crt
    SSLCertificateKeyFile /data/private/101/etc/apache2/ssl/redmine.key
    SSLCACertificatePath /data/private/101/etc/ssl/certs/
    RequestHeader set X_FORWARDED_PROTO 'https'
    KeepAlive On
    KeepAliveTimeout 60
    <Proxy *>
        Order allow,deny
        Allow from all
    </Proxy>
    ErrorLog /var/log/apache2/redmine.err.log
    LogFormat "%t \"%r\" %D" measure-time
    CustomLog /var/log/apache2/redmine.time.log measure-time
</VirtualHost>
As you can see, I enabled a timing log (the measure-time format, which uses %D) to find out how individual requests behave. According to that log, the slowdowns seem to be quite random.
Example contents of this log file:
[10/Feb/2014:09:48:36 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 501
...
[10/Feb/2014:09:48:35 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 20994933
...
[10/Feb/2014:09:49:07 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 418
The last value is the time (in µs) it took to process the request. Usually it takes only about 500 µs to serve this file, but the exact same request can also take about 21 seconds (20994933 µs), even within the same minute as a fast one. Since the file is a static stylesheet, I think that rules out the Ruby process as a possible cause. This hypothesis is supported by the fact that the server shows no load (neither CPU nor I/O) during such slow requests. The slowdowns are also independent of the load of other VMs on the host. They seem to be completely random.
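To make outliers like this easier to spot, the timing log can be filtered with a small script. The following is only a sketch in Python, assuming every line has exactly the `%t "%r" %D` layout shown above. The function names and the 1-second threshold are my own choices for illustration, not part of the setup.

```python
import re

# Matches lines written by the "measure-time" LogFormat: %t "%r" %D
# Assumption: the request line itself contains no double quotes.
LINE_RE = re.compile(
    r'^\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" (?P<micros>\d+)$'
)

# The three example lines from the log excerpt above.
SAMPLE_LINES = [
    '[10/Feb/2014:09:48:36 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 501',
    '[10/Feb/2014:09:48:35 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 20994933',
    '[10/Feb/2014:09:49:07 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 418',
]


def parse_line(line):
    """Return (timestamp, request, seconds), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    # %D is reported in microseconds; convert to seconds.
    seconds = int(match.group("micros")) / 1_000_000
    return match.group("time"), match.group("request"), seconds


def slow_requests(lines, threshold_seconds=1.0):
    """Yield parsed entries whose duration is at or above the threshold."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry[2] >= threshold_seconds:
            yield entry


if __name__ == "__main__":
    for timestamp, request, seconds in slow_requests(SAMPLE_LINES):
        print(f"{timestamp}  {seconds:.3f} s  {request}")
```

Run against the real file (for example by passing `open("/var/log/apache2/redmine.time.log")` instead of `SAMPLE_LINES`), this should list only the requests that took a second or longer, which makes it easier to check whether they cluster in time or by URL.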
Because of this specific setup, there are a lot of possible causes, and I don't really know where to start.
Maybe somebody with more experience with the components involved can give me a hint on how to approach diagnosing this issue.