I have nginx with an upstream of 4 Redis servers. Sometimes I get errors like the following in the nginx error.log (20-30 per minute, and only for the first and second servers in the upstream):

... upstream timed out (110: Connection timed out) while connecting to upstream ... upstream: "redis2://AAA.BBB.CCC.DDD:6379" ....
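To confirm that the timeouts really are concentrated on the first two servers, the log can be tallied per upstream address. The following is only a rough sketch: the function name count_upstream_timeouts is made up for illustration, and it assumes each error line contains the quoted upstream: "..." field shown above.

```python
import re
import sys
from collections import Counter

# Matches the quoted upstream address in an nginx error-log line,
# e.g. upstream: "redis2://AAA.BBB.CCC.DDD:6379"
UPSTREAM_RE = re.compile(r'upstream: "([^"]+)"')


def count_upstream_timeouts(lines):
    """Count 'upstream timed out' errors per upstream address.

    Hypothetical helper, not part of nginx: it only scans text lines.
    """
    counts = Counter()
    for line in lines:
        if "upstream timed out" not in line:
            continue  # ignore other error types
        match = UPSTREAM_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python count_timeouts.py /var/log/nginx/error.log
    with open(sys.argv[1], errors="replace") as log:
        for upstream, n in count_upstream_timeouts(log).most_common():
            print(n, upstream)
```

Run against the error.log, this prints one line per upstream with its timeout count, highest first.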

Load average on the Redis servers and on the nginx host is below 1. All machines run CentOS 6.6, and nginx handles 250-350 requests per second.

What could be the cause of these errors? Thanks in advance.

nginx.conf

user nginx;
worker_processes  4;
timer_resolution 100ms;
worker_priority -15;
worker_rlimit_nofile 200000;

error_log  /var/log/nginx/error.log;
pid        /var/run/nginx.pid;

events {
  worker_connections  65536;
  use epoll;
  multi_accept on;
}

http {


  include       /etc/nginx/mime.types;
  default_type  application/octet-stream;

  access_log    /var/log/nginx/access.log;

  sendfile on;
  tcp_nopush on;
  tcp_nodelay on;

  keepalive_timeout  65;

  gzip  on;
  gzip_http_version 1.0;
  gzip_comp_level 2;
  gzip_proxied any;
  gzip_vary off;
  gzip_types text/plain text/css application/x-javascript text/xml application/xml application/rss+xml application/atom+xml text/javascript application/javascript application/json text/mathml;
  gzip_min_length  1000;
  gzip_disable     "MSIE [1-6]\.";

  server_names_hash_bucket_size 64;
  types_hash_max_size 2048;
  types_hash_bucket_size 64;

   include /etc/nginx/sites-enabled/*;
}

upstream config:

upstream redis_cluster {
    server redis1.mydomain.com:6379 max_fails=0 fail_timeout=1s weight=4;
    server redis2.mydomain.com:6379 max_fails=0 fail_timeout=1s weight=4;
    server redis3.mydomain.com:6379 max_fails=0 fail_timeout=1s weight=4;
    server redis4.mydomain.com:6379 max_fails=0 fail_timeout=1s weight=4;
}
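Since the error occurs while connecting, it may help to time plain TCP connects from the nginx host to each server in the upstream block and compare the first two with the others. This is a minimal sketch under assumptions: measure_connect and probe are hypothetical helper names, and the 1-second timeout is an arbitrary choice, not a value taken from the nginx configuration.

```python
import socket
import sys
import time


def measure_connect(host, port, timeout=1.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def probe(hosts, port=6379, attempts=20):
    """Time several connects per host; None entries mark failed attempts."""
    return {h: [measure_connect(h, port) for _ in range(attempts)] for h in hosts}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python probe.py redis1.mydomain.com redis2.mydomain.com ...
    for host, samples in probe(sys.argv[1:]).items():
        ok = [s for s in samples if s is not None]
        failed = len(samples) - len(ok)
        slowest = max(ok) if ok else float("nan")
        print(f"{host}: {failed} failed, slowest connect {slowest:.4f}s")
```

If the failures show up here too, the problem is likely below nginx (network path or the accept queue on the Redis side) rather than in the nginx configuration.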

sysctl.conf (on the nginx host, changed settings only)

net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.tcp_max_syn_backlog = 20480
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.netfilter.nf_conntrack_max = 1048576
net.nf_conntrack_max = 1048576
net.ipv4.tcp_congestion_control = htcp
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_tw_reuse = 1
net.core.somaxconn = 15000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_keepalive_probes = 5

sysctl.conf (on the Redis servers; essentially the same, changed settings only)

vm.overcommit_memory = 1
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.tcp_max_syn_backlog = 20480
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.netfilter.nf_conntrack_max = 1048576
net.nf_conntrack_max = 1048576
net.ipv4.tcp_congestion_control = htcp
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_tw_reuse = 1
net.core.somaxconn = 15000
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_keepalive_time = 1800
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_keepalive_probes = 5
d.ansimov
What is happening on the Redis server itself? How long does it last? I would look at the Redis logs and use a tool to monitor Redis. We use Icinga with a check_redis script. redis-stat is also available, though I have never tried it. The Redis 'INFO' command will also show you a lot. – chrislovecnm Jan 30 '15 at 08:37
Another option is to put http://redis.io/topics/sentinel in place and let it tell you what is going on. – chrislovecnm Jan 30 '15 at 08:38
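As a follow-up to the INFO suggestion above, the output can be reduced to the connection-related counters and compared across the four servers. A small sketch, assuming the usual key:value text format of INFO; parse_redis_info and connection_summary are hypothetical names, and the chosen fields are just a starting selection.

```python
# INFO fields that relate to client connections.
CONNECTION_KEYS = (
    "connected_clients",
    "blocked_clients",
    "total_connections_received",
    "rejected_connections",
)


def parse_redis_info(text):
    """Parse the text returned by the Redis INFO command into a dict of strings."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# Section" headers
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def connection_summary(info):
    """Keep only the connection-related fields that are present."""
    return {k: info[k] for k in CONNECTION_KEYS if k in info}
```

The input text can be captured with redis-cli, for example by saving the output of redis-cli -h redis1.mydomain.com info to a file for each server and passing its contents to parse_redis_info.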
