
I have a web application with highly variable response times, and I can't find out what causes them.

This is the (quite complicated) setup:

A powerful dedicated server (i7, 32 GB RAM) in a German data center, running Debian 6 with Proxmox. On that host, there's an OpenVZ container configured to use 2 CPU cores and 2 GB of RAM. In this container, I run Ubuntu 12.04 and a Redmine instance (a Rails application).

Redmine is used with Apache 2 through Phusion Passenger.

Apache vhost config (container):

<VirtualHost *:80>
  ServerName redmine.somedomain.com
  DocumentRoot /var/www/redmine/
  <Directory "/var/www/redmine/">
    RailsBaseURI /
    PassengerResolveSymlinksInDocumentRoot on
    Options FollowSymLinks
    AllowOverride All
    Order allow,deny
    Allow from all
  </Directory>
  RewriteEngine On
  # Check for maintenance file and redirect all requests
  RewriteCond %{DOCUMENT_ROOT}/system/maintenance.html -f
  RewriteCond %{SCRIPT_FILENAME} !maintenance.html
  RewriteRule ^.*$ /system/maintenance.html [L]

  # Rewrite index to check for static
  RewriteRule ^/$ /cache/index.html [QSA]

  # Rewrite to check for Rails cached page
  RewriteRule ^([^.]+)$ /cache/$1.html [QSA]

  ErrorLog /var/log/apache2/redmine.error.log
  CustomLog /var/log/apache2/redmine.access.log combined

  ServerSignature Off
</VirtualHost>

Passenger config:

<IfModule mod_passenger.c>
  PassengerRoot /var/lib/gems/1.9.1/gems/passenger-3.0.19
  PassengerRuby /usr/bin/ruby
  PassengerDefaultUser redmine
  PassengerDefaultGroup redmine
  PassengerPoolIdleTime 0
  PassengerMinInstances 4
  PassengerMaxPoolSize 10

  PassengerStatThrottleRate 600
  RailsFrameworkSpawnerIdleTime 0
  RailsAppSpawnerIdleTime 0
</IfModule>

The container doesn't have an external IP, so I use an Apache reverse proxy on the host.

The proxy config (host):

<VirtualHost *:443>
  ServerName redmine.somedomain.com
  SSLProxyEngine On
  ProxyRequests off
  ProxyPreserveHost on
  ProxyPass / http://192.168.2.101/ keepalive=on max=100
  ProxyPassReverse / http://192.168.2.101/

  SSLEngine on
  SSLCertificateFile /data/private/101/etc/apache2/ssl/redmine.crt
  SSLCertificateKeyFile /data/private/101/etc/apache2/ssl/redmine.key
  SSLCACertificatePath /data/private/101/etc/ssl/certs/

  RequestHeader set X_FORWARDED_PROTO 'https'

  KeepAlive On
  KeepAliveTimeout 60

  <Proxy *>
    Order allow,deny
    Allow from all
  </Proxy>
  ErrorLog /var/log/apache2/redmine.err.log

  LogFormat "%t \"%r\" %D" measure-time
  CustomLog /var/log/apache2/redmine.time.log measure-time
</VirtualHost>

As you can see, I enabled a log with timing information (%D) on the proxy to find out how individual requests behave. According to that log, the slow responses seem to occur quite randomly.

Example contents of this log file:

[10/Feb/2014:09:48:36 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 501
...
[10/Feb/2014:09:48:35 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 20994933
...
[10/Feb/2014:09:49:07 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 418

The last value is the time (in µs) it took to process the request. Usually it takes only about 500 µs to serve this file, but the exact same request can also take more than 20 seconds within the same minute. In my opinion, that should exclude the Ruby process as a possible cause. This hypothesis is supported by the fact that during such slow requests, the server doesn't show any load (be it CPU or I/O). The slowness is also independent of the load of other VMs on the host. It seems to be completely random.
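To make the slow entries easier to spot, the time log can be filtered with a small script. The following is only a minimal sketch: it assumes every line follows the measure-time format shown above (timestamp, request line, duration in µs), and the names `parse_line`, `slow_requests`, `SAMPLE` and the 1-second threshold are illustrative choices, not part of the setup.

```python
import re

# Matches lines written by: LogFormat "%t \"%r\" %D" measure-time
LINE_RE = re.compile(r'^\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" (?P<micros>\d+)$')

# Three entries copied from the log excerpt above.
SAMPLE = [
    '[10/Feb/2014:09:48:36 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 501',
    '[10/Feb/2014:09:48:35 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 20994933',
    '[10/Feb/2014:09:49:07 +0100] "GET /plugin_assets/redmine_contacts_helpdesk/stylesheets/helpdesk.css?1377871228 HTTP/1.1" 418',
]


def parse_line(line):
    """Return (timestamp, request, duration_us), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return match.group("time"), match.group("request"), int(match.group("micros"))


def slow_requests(lines, threshold_us=1_000_000):
    """Yield parsed entries that took at least threshold_us microseconds."""
    for line in lines:
        entry = parse_line(line)
        if entry is not None and entry[2] >= threshold_us:
            yield entry


if __name__ == "__main__":
    # Assumed threshold: anything slower than 1 second counts as "slow".
    for timestamp, request, micros in slow_requests(SAMPLE):
        print(f"{timestamp}  {micros / 1e6:.1f} s  {request}")
```

To run it against the real log, the `SAMPLE` list can be replaced by an open file handle for /var/log/apache2/redmine.time.log. The timestamps of the slow entries can then be compared with the container's own access log to see whether the delay occurs in the proxy or behind it.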

Because of the specific setup, there are a lot of possible causes and I don't really know where to start.

Maybe somebody with more experience with the components involved can give me a hint on how to approach diagnosing this issue.

didi_X8
  • Set up some monitoring and gather relevant statistics to inform your diagnosis. – user9517 Feb 11 '14 at 08:31
  • apt-get install sysstat, enable it in /etc/default/sysstat. Let it gather stats. Learn how to run 'sar' to retrieve all information for the time period during which the requests were served slowly. Get back to us with that information. – 3molo Feb 11 '14 at 09:11
  • OK, I have sysstat running now. I hadn't used it before. However, I also have munin running on that host. As far as I can tell, it shows no relevant load this morning: average load below 1 all the time, CPUs idling, no I/O going on. I'm quite (but not 100%) sure system load is not the problem here. It looks more like a misconfiguration to me. – didi_X8 Feb 11 '14 at 14:56
