
We have been running ingress-nginx for a while now, and about 10% of requests fail with some SSL handshake problem.

Here is an example of a failing connection:

2019/02/14 10:15:35 [debug] 237#237: *4612 accept: **.**.**.**:40928 fd:53
2019/02/14 10:15:35 [debug] 237#237: *4612 event timer add: 53: 60000:5527050245
2019/02/14 10:15:35 [debug] 237#237: *4612 reusable connection: 1
2019/02/14 10:15:35 [debug] 237#237: *4612 epoll add event: fd:53 op:1 ev:80002001
2019/02/14 10:15:45 [debug] 237#237: *4612 http check ssl handshake
2019/02/14 10:15:45 [debug] 237#237: *4612 http recv(): 0
2019/02/14 10:15:45 [info] 237#237: *4612 client closed connection while SSL handshaking, client: **.**.**.**, server: 0.0.0.0:443
2019/02/14 10:15:45 [debug] 237#237: *4612 close http connection: 53
2019/02/14 10:15:45 [debug] 237#237: *4612 event timer del: 53: 5527050245
2019/02/14 10:15:45 [debug] 237#237: *4612 reusable connection: 0
2019/02/14 10:15:45 [debug] 237#237: *4612 free: 00007F4CC5858E00, unused: 232
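
As far as I can tell, recv() returning 0 means the client closed the TCP connection cleanly, about ten seconds after the accept, without ever sending a ClientHello. For comparison, a handshake can be exercised by hand; this is just a generic sketch, with a placeholder hostname since ours is redacted above:

    # www.example.com stands in for our redacted hostname.
    # -servername sends SNI; -state prints each handshake state transition.
    openssl s_client -connect www.example.com:443 \
        -servername www.example.com -state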

A 10% failure rate seems far too high to be expected.

We run ingress-nginx in Kubernetes on Google Cloud. We see the problem with both versions 0.18.0 and 0.24.0 of quay.io/kubernetes-ingress-controller/nginx-ingress-controller.

The clients are mostly ordinary people who click links to our website while browsing the internet.

What could the "client closed connection while SSL handshaking" error mean in this situation? I would really appreciate any help with this!

Here is the nginx configuration:

# Configuration checksum: 4563241843242056760
# setup custom paths that do not require root access
pid /tmp/nginx.pid;
daemon off;
worker_processes 4;
worker_rlimit_nofile 261120;
worker_shutdown_timeout 10s;

events {
    multi_accept        on;
    worker_connections  16384;
    use                 epoll;
}

http {

    lua_package_cpath "/usr/local/lib/lua/?.so;/usr/lib/lua-platform-path/lua/5.1/?.so;;";
    lua_package_path "/etc/nginx/lua/?.lua;/etc/nginx/lua/vendor/?.lua;/usr/local/lib/lua/?.lua;;";

    lua_shared_dict configuration_data 5M;
    lua_shared_dict locks 512k;
    lua_shared_dict balancer_ewma 1M;
    lua_shared_dict balancer_ewma_last_touched_at 1M;
    lua_shared_dict sticky_sessions 1M;

    init_by_lua_block {
        require("resty.core")
        collectgarbage("collect")

        local lua_resty_waf = require("resty.waf")
        lua_resty_waf.init()

        -- init modules
        local ok, res

        ok, res = pcall(require, "configuration")
        if not ok then
        error("require failed: " .. tostring(res))
        else
        configuration = res
    configuration.nameservers = { "**.**.**.**" }
        end

        ok, res = pcall(require, "balancer")
        if not ok then
        error("require failed: " .. tostring(res))
        else
        balancer = res
        end

        ok, res = pcall(require, "monitor")
        if not ok then
        error("require failed: " .. tostring(res))
        else
        monitor = res
        end
    }

    init_worker_by_lua_block {
        balancer.init_worker()
    }

    real_ip_header      X-Forwarded-For;

    real_ip_recursive   on;

    set_real_ip_from    0.0.0.0/0;

    geoip_country       /etc/nginx/geoip/GeoIP.dat;
    geoip_city          /etc/nginx/geoip/GeoLiteCity.dat;
    geoip_org           /etc/nginx/geoip/GeoIPASNum.dat;
    geoip_proxy_recursive on;

    aio                 threads;
    aio_write           on;

    tcp_nopush          on;
    tcp_nodelay         on;

    log_subrequest      on;

    reset_timedout_connection on;

    keepalive_timeout  0s;
    keepalive_requests 100;

    client_body_temp_path           /tmp/client-body;
    fastcgi_temp_path               /tmp/fastcgi-temp;
    proxy_temp_path                 /tmp/proxy-temp;
    ajp_temp_path                   /tmp/ajp-temp; 

    client_header_buffer_size       16k;
    client_header_timeout           60s;
    large_client_header_buffers     8 32k;
    client_body_buffer_size         8k;
    client_body_timeout             60s;

    http2_max_field_size            4k;
    http2_max_header_size           16k;

    types_hash_max_size             2048;
    server_names_hash_max_size      1024;
    server_names_hash_bucket_size   64;
    map_hash_bucket_size            64;

    proxy_headers_hash_max_size     512;
    proxy_headers_hash_bucket_size  64;

    variables_hash_bucket_size      128;
    variables_hash_max_size         2048;

    underscores_in_headers          off;
    ignore_invalid_headers          on;

    limit_req_status                503;

    include /etc/nginx/mime.types;
    default_type text/html;

    gzip on;
    gzip_comp_level 5;
    gzip_http_version 1.1;
    gzip_min_length 256;
    gzip_types application/atom+xml application/javascript application/x-javascript application/json application/rss+xml application/vnd.ms-fontobject application/x-font-ttf application/x-web-app-manifest+json application/xhtml+xml application/xml font/opentype image/svg+xml image/x-icon text/css text/plain text/x-component;
    gzip_proxied any;
    gzip_vary on;

    # Custom headers for response

    server_tokens on;

    # disable warnings
    uninitialized_variable_warn off;

    # Additional available variables:
    # $namespace
    # $ingress_name
    # $service_name
    # $service_port
    log_format upstreaminfo '"ingress-nginx","$remote_addr","$remote_user","$time_local","$msec","$request_method","$scheme","$host","$request_uri","$server_protocol",$status,$body_bytes_sent,"$http_referer","$http_user_agent",$request_time,$request_length,$upstream_connect_time,$upstream_header_time,$upstream_response_time,$upstream_response_length,"$proxy_upstream_name",$upstream_status,"$https","$req_id","$namespace","$ingress_name","$service_name","$service_port","$the_real_ip","$geoip_country_name","$geoip_country_code","$geoip_region","$geoip_city","$geoip_latitude","$geoip_longitude"';

    map $request_uri $loggable {

        default 1;
    }

    access_log /usr/share/nginx/access.log upstreaminfo if=$loggable;

    error_log  /var/log/nginx/error.log info;

    resolver **.**.**.** valid=30s;

    # Retain the default nginx handling of requests without a "Connection" header
    map $http_upgrade $connection_upgrade {
        default          upgrade;
        ''               close;
    }

    map $http_x_forwarded_for $the_real_ip {

        default          $remote_addr;

    }

    # trust http_x_forwarded_proto headers correctly indicate ssl offloading
    map $http_x_forwarded_proto $pass_access_scheme {
        default          $http_x_forwarded_proto;
        ''               $scheme;
    }

    # validate $pass_access_scheme and $scheme are http to force a redirect
    map "$scheme:$pass_access_scheme" $redirect_to_https {
        default          0;
        "http:http"      1;
        "https:http"     1;
    }

    map $http_x_forwarded_port $pass_server_port {
        default           $http_x_forwarded_port;
        ''                $server_port;
    }

    map $pass_server_port $pass_port {
        443              443;
        default          $pass_server_port;
    }

    # Obtain best http host
    map $http_host $this_host {
        default          $http_host;
        ''               $host;
    }

    map $http_x_forwarded_host $best_http_host {
        default          $http_x_forwarded_host;
        ''               $this_host;
    }

    # Reverse proxies can detect if a client provides a X-Request-ID header, and pass it on to the backend server.
    # If no such header is provided, it can provide a random value.
    map $http_x_request_id $req_id {
        default   $http_x_request_id;

        ""        $request_id;

    }

    server_name_in_redirect off;
    port_in_redirect        off;

    ssl_protocols SSLv3 TLSv1 TLSv1.1 TLSv1.2;

    # turn on session caching to drastically improve performance

    ssl_session_cache builtin:1000 shared:SSL:10m;
    ssl_session_timeout 10m;

    # allow configuring ssl session tickets
    ssl_session_tickets on;

    # slightly reduce the time-to-first-byte
    ssl_buffer_size 4k;

    # allow configuring custom ssl ciphers
    ssl_ciphers 'ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-DSS-AES128-GCM-SHA256:kEDH+AESGCM:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA384:ECDHE-RSA-AES256-SHA:ECDHE-ECDSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-DSS-AES128-SHA256:DHE-RSA-AES256-SHA256:DHE-DSS-AES256-SHA:DHE-RSA-AES256-SHA:ECDHE-RSA-DES-CBC3-SHA:ECDHE-ECDSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:AES:DES-CBC3-SHA:HIGH:SEED:!aNULL:!eNULL:!EXPORT:!RC4:!MD5:!PSK:!RSAPSK:!aDH:!aECDH:!EDH-DSS-DES-CBC3-SHA:!KRB5-DES-CBC3-SHA:!SRP';
    ssl_prefer_server_ciphers on;

    ssl_ecdh_curve auto;

    proxy_ssl_session_reuse on;

    upstream upstream_balancer {
        server 0.0.0.1; # placeholder

        balancer_by_lua_block {
            balancer.balance()
        }

        keepalive 32;

    }

    ## start server *.***.com
    server {
        server_name *.***.com ;

        listen 80;

        listen [::]:80;

        set $proxy_upstream_name "-";

        listen 443  ssl http2;

        listen [::]:443  ssl http2;

        # PEM sha: 9cc55357f33919dea89b125fda32ccf33d554131
        ssl_certificate                         /etc/ingress-controller/ssl/default-wildcard-***-com-tls-secret.pem;
        ssl_certificate_key                     /etc/ingress-controller/ssl/default-wildcard-***-com-tls-secret.pem;

        location / {

            set $namespace      "default";
            set $ingress_name   "***-com";
            set $service_name   "***";
            set $service_port   "0";
            set $location_path  "/";

            rewrite_by_lua_block {

                balancer.rewrite()

            }

            log_by_lua_block {

                balancer.log()

                monitor.call()
            }

            if ($scheme = https) {
                more_set_headers                        "Strict-Transport-Security: max-age=15724800; includeSubDomains";
            }

            port_in_redirect off;

            set $proxy_upstream_name "***-8080";

            client_max_body_size                    "1m";

            proxy_set_header Host                   $best_http_host;

            # Pass the extracted client certificate to the backend

            # Allow websocket connections
            proxy_set_header                        Upgrade           $http_upgrade;

            proxy_set_header                        Connection        $connection_upgrade;

            proxy_set_header X-Request-ID           $req_id;
            proxy_set_header X-Real-IP              $the_real_ip;

            proxy_set_header X-Forwarded-For        $the_real_ip;

            proxy_set_header X-Forwarded-Host       $best_http_host;
            proxy_set_header X-Forwarded-Port       $pass_port;
            proxy_set_header X-Forwarded-Proto      $pass_access_scheme;

            proxy_set_header X-Original-URI         $request_uri;

            proxy_set_header X-Scheme               $pass_access_scheme;

            # Pass the original X-Forwarded-For
            proxy_set_header X-Original-Forwarded-For $http_x_forwarded_for;

            # mitigate HTTPoxy Vulnerability
            # https://www.nginx.com/blog/mitigating-the-httpoxy-vulnerability-with-nginx/
            proxy_set_header Proxy                  "";

            # Custom headers to proxied server

            proxy_connect_timeout                   5s;
            proxy_send_timeout                      60s;
            proxy_read_timeout                      60s;

            proxy_buffering                         "off";
            proxy_buffer_size                       "4k";
            proxy_buffers                           4 "4k";
            proxy_request_buffering                 "on";

            proxy_http_version                      1.1;

            proxy_cookie_domain                     off;
            proxy_cookie_path                       off;

            # In case of errors try the next upstream server before returning an error
            proxy_next_upstream                     error timeout;
            proxy_next_upstream_tries               3;

            proxy_pass http://upstream_balancer;

            proxy_redirect                          off;

        }

    }
    ## end server *.***.com

    ## start server _
    server {
        server_name _ ;

        listen 80 default_server reuseport backlog=511;

        listen [::]:80 default_server reuseport backlog=511;

        set $proxy_upstream_name "-";

        listen 443  default_server reuseport backlog=511 ssl http2;

        listen [::]:443  default_server reuseport backlog=511 ssl http2;

        # PEM sha: 9cc55357f33919dea89b125fda32ccf33d554131
        ssl_certificate                         /etc/ingress-controller/ssl/default-wildcard-***-com-tls-secret.pem;
        ssl_certificate_key                     /etc/ingress-controller/ssl/default-wildcard-***-com-tls-secret.pem;

        location / {

            set $namespace      "";
            set $ingress_name   "";
            set $service_name   "";
            set $service_port   "0";
            set $location_path  "/";

            rewrite_by_lua_block {

                balancer.rewrite()

            }

            log_by_lua_block {

                balancer.log()

                monitor.call()
            }

            if ($scheme = https) {
                more_set_headers                        "Strict-Transport-Security: max-age=15724800; includeSubDomains";
            }

            access_log off;

            port_in_redirect off;

            set $proxy_upstream_name "upstream-default-backend";

            client_max_body_size                    "1m";

            proxy_set_header Host                   $best_http_host;

            # Pass the extracted client certificate to the backend

            # Allow websocket connections
            proxy_set_header                        Upgrade           $http_upgrade;

            proxy_set_header                        Connection        $connection_upgrade;

            proxy_set_header X-Request-ID           $req_id;
            proxy_set_header X-Real-IP              $the_real_ip;

            proxy_set_header X-Forwarded-For        $the_real_ip;

            proxy_set_header X-Forwarded-Host       $best_http_host;
            proxy_set_header X-Forwarded-Port       $pass_port;
            proxy_set_header X-Forwarded-Proto      $pass_access_scheme;

            proxy_set_header X-Original-URI         $request_uri;

            proxy_set_header X-Scheme               $pass_access_scheme;

            # Pass the original X-Forwarded-For
            proxy_set_header X-Original-Forwarded-For $http_x_forwarded_for;

            # mitigate HTTPoxy Vulnerability
            # https://www.nginx.com/blog/mitigating-the-httpoxy-vulnerability-with-nginx/
            proxy_set_header Proxy                  "";

            # Custom headers to proxied server

            proxy_connect_timeout                   5s;
            proxy_send_timeout                      60s;
            proxy_read_timeout                      60s;

            proxy_buffering                         "off";
            proxy_buffer_size                       "4k";
            proxy_buffers                           4 "4k";
            proxy_request_buffering                 "on";

            proxy_http_version                      1.1;

            proxy_cookie_domain                     off;
            proxy_cookie_path                       off;

            # In case of errors try the next upstream server before returning an error
            proxy_next_upstream                     error timeout;
            proxy_next_upstream_tries               3;

            proxy_pass http://upstream_balancer;

            proxy_redirect                          off;

        }

        # health checks in cloud providers require the use of port 80
        location /healthz {

            access_log off;
            return 200;
        }

        # this is required to avoid error if nginx is being monitored
        # with an external software (like sysdig)
        location /nginx_status {

            allow 127.0.0.1;

            allow ::1;

            deny all;

            access_log off;
            stub_status on;
        }

    }
    ## end server _

    # default server, used for NGINX healthcheck and access to nginx stats
    server {
        listen 18080 default_server reuseport backlog=511;
        listen [::]:18080 default_server reuseport backlog=511;
        set $proxy_upstream_name "-";

        location /healthz {

            access_log off;
            return 200;
        }

        location /is-dynamic-lb-initialized {

            access_log off;

            content_by_lua_block {
                local configuration = require("configuration")
                local backend_data = configuration.get_backends_data()
                if not backend_data then
                ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
                return
                end

                ngx.say("OK")
                ngx.exit(ngx.HTTP_OK)
            }
        }

        location /nginx_status {
            set $proxy_upstream_name "internal";

            access_log off;
            stub_status on;
        }

        location /configuration {
            access_log off;

            allow 127.0.0.1;

            allow ::1;

            deny all;

            # this should be equals to configuration_data dict
            client_max_body_size                    "10m";
            proxy_buffering                         off;

            content_by_lua_block {
                configuration.call()
            }
        }

        location / {

            set $proxy_upstream_name "upstream-default-backend";

            proxy_pass          http://upstream_balancer;

        }

    }
}

stream {
    log_format log_stream [$time_local] $protocol $status $bytes_sent $bytes_received $session_time;

    access_log /usr/share/nginx/access.log log_stream;

    error_log  /var/log/nginx/error.log;

    # TCP services

    # UDP services
}
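
One idea I may try (not in the config above): add the $ssl_protocol and $ssl_cipher variables from ngx_http_ssl_module to the log format, to see what the clients that do complete a handshake negotiate. Failed handshakes never produce an access-log line, so this only profiles the successful ~90%, but it might hint at what the failing clients have in common. A sketch, abbreviating the existing format:

    # Sketch only: the "..." stands for the existing upstreaminfo fields above.
    # $ssl_protocol and $ssl_cipher come from ngx_http_ssl_module and are
    # empty on plain-HTTP connections. In ingress-nginx this would be set
    # through the log-format-upstream key of the controller ConfigMap
    # rather than by editing nginx.conf directly.
    log_format upstreaminfo '...,"$ssl_protocol","$ssl_cipher"';
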
  • Welcome to Server Fault! Please read the "How to ask" FAQ: https://serverfault.com/help/how-to-ask Pay attention to the "Ask a question" section, then come back and edit your... statement :) – NoMad Feb 14 '19 at 17:33
  • @sergey-vlassiev, could you please share your nginx configuration? It's hard to tell what's going on without knowing what the client is. – max Feb 15 '19 at 02:19
  • @max thanks. I've added the nginx configuration. Unfortunately, I'm not an expert on this topic, so I'm not sure what else may be helpful. Clients are typically people who click links leading to our endpoint from other websites. – Sergey Vlassiev Feb 15 '19 at 09:52
  • @SergeyVlassiev try testing your SSL setup with https://www.ssllabs.com/ssltest/ – c4f4t0r Feb 15 '19 at 10:48
  • @c4f4t0r thanks for the advice. The test shows an A+ overall rating, 100% certificate coverage, 95% protocol support, and 90% for Key Exchange and Cipher Strength. Nothing suspicious from my point of view. Maybe you know which parts of that check are most important for my problem? Moreover, most requests are handled without problems, but about 10% of SSL connections are simply closed by the client during the handshake. – Sergey Vlassiev Feb 15 '19 at 11:38
  • maybe this failing 10% is a particular type of client using a specific cipher; try increasing the nginx log verbosity – c4f4t0r Feb 15 '19 at 11:47
  • @c4f4t0r thanks, that's a good idea. But usually, when the client and server don't share ciphers, we get a different error: **SSL_do_handshake() failed … no shared cipher**. In our case the error is something else. According to the [nginx docs](http://nginx.org/en/docs/ngx_core_module.html#error_log) we already use the most verbose log level. It looks like this particular error just doesn't produce more data, even in debug mode. Maybe you know other ways to increase the verbosity of ingress-nginx logs? – Sergey Vlassiev Feb 15 '19 at 12:48
  • @SergeyVlassiev try capturing traffic with tcpdump on the NodePort of the ingress Service and check whether you catch the error; you can analyze the capture file with Wireshark (see the sketch after this thread) – c4f4t0r Feb 15 '19 at 12:59
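
A capture along the lines c4f4t0r suggests might look like the following; the port is a placeholder, since the real NodePort is not given in the thread:

    # Run on the node hosting the ingress pod; 31443 is a placeholder for
    # the real NodePort, visible via kubectl get svc.
    sudo tcpdump -i any -s 0 -w ssl-handshakes.pcap 'tcp port 31443'

    # Open ssl-handshakes.pcap in Wireshark and filter on ssl.handshake
    # (tls.handshake in newer versions) to find connections that close
    # before or during the ClientHello/ServerHello exchange.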
