I'm having trouble working out why my load test results are as they are, and I may be missing something obvious. Sorry this isn't very specific, but if anyone has any pointers on areas to focus on, that would be very helpful. Cheers.
Load Test
The test generates about 5,486 writes to the DB per minute (roughly 90 per second). I can see the following errors in the logs as the servers become overwhelmed:
- (11: Resource temporarily unavailable) while connecting to upstream
- Upstream timed out (110: Connection timed out) while reading response header from upstream
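To see which of the two errors dominates at peak, the error log can be tallied by errno. The following is only a minimal sketch: the regular expression is an assumption based on the two messages quoted above, and `tally_upstream_errors` is a hypothetical helper name, not part of any tool.

```python
import re
from collections import Counter

# Matches the errno part of an Nginx upstream error, e.g.
# "(110: Connection timed out) while reading response header from upstream".
# ASSUMPTION: the pattern is derived only from the two messages quoted above.
UPSTREAM_ERROR = re.compile(r"\((\d+): ([^)]+)\) while (.+?) upstream")


def tally_upstream_errors(lines):
    """Count upstream errors per (errno, message, phase)."""
    counts = Counter()
    for line in lines:
        match = UPSTREAM_ERROR.search(line)
        if match:
            errno, message, phase = match.groups()
            counts[(int(errno), message, phase)] += 1
    return counts


# The two quoted messages, used as stand-in input instead of a real log file.
sample = [
    "(11: Resource temporarily unavailable) while connecting to upstream",
    "Upstream timed out (110: Connection timed out) while reading response header from upstream",
]

for key, count in tally_upstream_errors(sample).items():
    print(key, count)
```

In practice the `sample` list would be replaced by the lines of the Nginx error log. Errno 11 is raised while connecting to the upstream and errno 110 while waiting for its response, so the ratio between them may hint at where requests are queuing.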
Issue
When running the load test I see the following problems:
- The page which performs the update/write (the one the load test is hitting) slows down and takes 10-20 seconds to load.
- Nginx returns seemingly arbitrary 404s on any of the pages. Results show that maybe 10-20% of requests during the peak period result in a 404.
I think these are two separate issues, possibly unrelated. I can't see any flat lines in the graphs that would suggest a limit is being reached.
- Web servers sit at around 60% CPU and remain stable. RAM looks OK.
- Database servers sit around 20% CPU and remain stable. RAM looks OK.
- Database connections go to 1500/2000. This looks iffy, although the graph doesn't flatline, which suggests it's not hitting a limit.
- Network connection limits appear to be OK.
- Tables are indexed where possible/appropriate.
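As a rough sanity check on these numbers, Little's law (average concurrency = arrival rate x time in system) gives an estimate of how many requests are in flight at once. This is a back-of-the-envelope sketch, assuming every request is slowed to the observed 10-20 seconds and that load is spread evenly over the 6 web servers; neither assumption has been verified.

```python
WRITES_PER_MINUTE = 5486      # from the load test
LATENCY_RANGE_S = (10, 20)    # observed page load time under load
WEB_SERVERS = 6               # number of web server instances


def in_flight(rate_per_s, latency_s):
    """Little's law: average concurrency = arrival rate x time in system."""
    return rate_per_s * latency_s


rate = WRITES_PER_MINUTE / 60  # requests per second

for latency in LATENCY_RANGE_S:
    total = in_flight(rate, latency)
    per_server = total / WEB_SERVERS  # ASSUMPTION: evenly balanced load
    print(f"{latency}s latency: ~{total:.0f} in flight, ~{per_server:.0f} per web server")
```

Under those assumptions the estimate lands between roughly 900 and 1,800 concurrent requests, which is in the same region as the 1500 database connections observed. That may be a coincidence, but it would be consistent with each slow request holding a connection open for its whole duration.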
Infrastructure
AWS RDS MySQL
- 1 x db.m3.xlarge for write operations
- 1 x db.m3.xlarge replicated DB for read operations
AWS EC2 web servers
- Linux, Nginx, PHP-FPM
- 6 x c3.2xlarge
Config
/etc/php-fpm.d/domain.com.conf
[domain.com]
user = nginx
group = nginx
;;;The address on which to accept FastCGI requests
listen = /var/run/php-fpm/domain.com.sock
;;;A value of '-1' means unlimited. Although this may be based on ulimit hard limit.
;;;May be worth setting as desired in case of the above.
listen.backlog = -1
;;;dynamic - the number of child processes is set dynamically based on the following directives: pm.max_children, pm.start_servers, pm.min_spare_servers, pm.max_spare_servers.
pm = dynamic
;;;maximum number of child processes to be created when pm is set to dynamic
pm.max_children = 512
;;;The number of child processes created on startup. Used only when pm is set to dynamic.
pm.start_servers = 8
;;;The desired minimum number of idle server processes. Used only when pm is set to dynamic.
pm.min_spare_servers = 2
;;;The desired maximum number of idle server processes. Used only when pm is set to dynamic.
pm.max_spare_servers = 16
;;;The number of requests each child process should execute before respawning.
pm.max_requests = 500
;;;The URI to view the FPM status page.
pm.status_path = /status/fpm/domain.com
;;;The timeout for serving a single request. This option should be used when the 'max_execution_time' ini option does not stop script execution
request_terminate_timeout = 30
;;;Set open file descriptor rlimit. Default value: system defined value.
;;;rlimit_files
;;;Set max core size rlimit. Possible Values: 'unlimited' or an integer greater or equal to 0. Default value: system defined value.
;;;rlimit_core int
php_admin_value[post_max_size] = 8M
php_admin_value[upload_max_filesize] = 8M
php_admin_value[disable_functions] = exec,passthru,system,proc_open,popen,show_source
;;; Site specific custom flags go here
;;; End of site specific flags
slowlog = /var/log/nginx/slow-query-$pool.log
request_slowlog_timeout = 10s
chdir = /
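It may be worth comparing the pool ceiling in this config with the database connection figures. The sketch below is plain arithmetic under three loudly stated assumptions: each web server runs only this one pool, each PHP-FPM worker holds at most one database connection, and "1500/2000" means 1500 connections against a limit of 2000.

```python
WEB_SERVERS = 6              # from the infrastructure list
MAX_CHILDREN = 512           # pm.max_children in this pool
DB_CONNECTIONS_SEEN = 1500   # observed during the load test
DB_CONNECTIONS_CAP = 2000    # ASSUMPTION: "1500/2000" means 1500 of a 2000 limit


def worker_ceiling(servers, max_children):
    """Upper bound on concurrent PHP-FPM workers across the fleet.

    ASSUMPTION: one pool per server, one DB connection per worker.
    """
    return servers * max_children


ceiling = worker_ceiling(WEB_SERVERS, MAX_CHILDREN)
print(f"worker ceiling across fleet: {ceiling}")
print(f"headroom over assumed DB cap: {ceiling - DB_CONNECTIONS_CAP}")
print(f"workers per server implied by observed connections: {DB_CONNECTIONS_SEEN / WEB_SERVERS:.0f}")
```

If those assumptions hold, the fleet could in principle open more connections than the database allows before PHP-FPM itself stops spawning children, so the database limit would be reached first. The PHP-FPM status page configured under `pm.status_path` reports the actual number of active processes per server and is probably the more reliable source.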
Nginx - /etc/nginx/nginx.conf
events {
worker_connections 19000;
# essential for linux, optimized to serve many clients with each thread
use epoll;
multi_accept on;
}
worker_rlimit_nofile 20000;
http {
include /etc/nginx/mime.types;
default_type application/octet-stream;
log_format proxy_combined '$http_x_real_ip - $remote_user [$time_local] "$request" '
'$status $body_bytes_sent "$http_referer" "$http_user_agent"';
access_log /var/log/nginx/access.log proxy_combined;
sendfile on;
## Start: Size Limits & Buffer Overflows ##
client_body_buffer_size 1K;
client_header_buffer_size 1k;
# client_max_body_size 1k;
large_client_header_buffers 2 1k;
## END: Size Limits & Buffer Overflows ##
## Start: Caching file descriptors ##
open_file_cache max=1000 inactive=20s;
open_file_cache_valid 30s;
open_file_cache_min_uses 2;
open_file_cache_errors on;
## END: Caching file descriptors ##
## Start: Timeouts ##
client_body_timeout 10;
client_header_timeout 10;
keepalive_timeout 5 5;
send_timeout 10;
## End: Timeouts ##
server_tokens off;
tcp_nodelay on;
gzip on;
gzip_http_version 1.1;
gzip_vary on;
gzip_comp_level 6;
gzip_proxied any;
gzip_types text/plain text/html text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript application/javascript text/x-js;
gzip_buffers 16 8k;
gzip_disable "MSIE [1-6]\.(?!.*SV1)";
client_max_body_size 30M;
proxy_cache_path /var/cache/nginx/c2b levels=1:2 keys_zone=c2b-cache:8m max_size=100m inactive=60m;
proxy_temp_path /var/cache/tmp;
proxy_ignore_headers Set-Cookie X-Accel-Expires Expires Cache-Control;
# allow the server to close the connection after a client stops responding. Frees up socket-associated memory.
reset_timedout_connection on;
include /etc/nginx/conf.d/*.conf;
}
Nginx site-specific - /etc/nginx/conf.d/domain.com
# pass the PHP scripts to the FastCGI server listening on the PHP-FPM Unix socket
location ~ \.php$ {
fastcgi_pass unix:/var/run/php-fpm/domain.com.sock;
fastcgi_index index.php;
fastcgi_param SCRIPT_FILENAME /var/www/domain.com/public_html/$fastcgi_script_name;
include fastcgi_params;
fastcgi_read_timeout 30;
}