
We had an issue today with the nginx resolver in an AWS environment. According to the nginx log files, nginx did not resolve to the right servers after the DNS records changed.

The DNS name originally resolved to:

webservers.ourservers.local A 10.0.0.5
webservers.ourservers.local A 10.0.1.8

... but as soon as one of these IP addresses changed, nginx did not pick up the change and kept trying to reach the outdated IP address, for far longer than the 60-second TTL.

According to nginx.org/en/docs/http/ngx_http_core_module.html#resolver, the nginx resolver caches answers only for the TTL or, in our case, for the time configured with the valid parameter (15 seconds).

My colleagues moved the resolver line from the location block to the server block, but it did not help. We have to restart nginx to force it to resolve the new IP.

Nginx configuration:

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$scheme - $http_x_client_ip - $remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;
    error_log   /var/log/nginx/error.log;

    server {
        listen 8080 default;

        set $OUR_ROOT /var/www/pub;

        location /health_check.html {
            root $OUR_ROOT;
            auth_basic off;
            allow all;
        }

        location / {
            # Work around to flush Nginx: http://nginx.org/en/docs/http/ngx_http_core_module.html#resolver
            # https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_DHCP_Options.html#AmazonDNS
            # This will be the base address of your VPC's IPv4 CIDR + 2
            # You can find this value in /etc/resolv.conf on the EC2 instance
            resolver 10.0.0.2 valid=15s;

            proxy_pass http://webservers.ourservers.local; # Will be resolved by Route 53
            proxy_set_header X-Real-IP  $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto https;
            proxy_set_header X-Forwarded-Proto-New https;
            proxy_set_header X-Forwarded-Port 443;
            proxy_set_header Host $host;
            proxy_redirect     off;
        }
    }

    #include custom config
    include /etc/nginx/conf.d/*.conf;
}

We use nginx version 1.13.7.

What did we do wrong? How can we make nginx re-fetch the DNS records?


Update/Clarification:

10.0.0.2 is the AWS internal nameserver for VPCs: https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_DHCP_Options.html#AmazonDNS . It resolves correctly when checked via dig, even while nginx is still using the outdated address. We tested these cases with:

dig webservers.ourservers.local @10.0.0.2
hey
  • What is `10.0.0.2`? Given nginx's documentation, this is more likely a problem with the upstream DNS server at `10.0.0.2` – Wesley Aug 21 '18 at 05:59
  • Nginx resolves static hostnames at startup. Put the hostname into a variable and use that in `proxy_pass` – Alexey Ten Aug 21 '18 at 06:34
  • See [this answer](https://serverfault.com/questions/240476/how-to-force-nginx-to-resolve-dns-of-a-dynamic-hostname-everytime-when-doing-p/593003#593003). – Richard Smith Aug 21 '18 at 08:39
  • Thank you for your comments! - 10.0.0.2 is the AWS internal DNS server: https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_DHCP_Options.html#AmazonDNS – hey Aug 21 '18 at 20:26
  • We have tested resolution via 10.0.0.2 with `dig ... @10.0.0.2`. It works every time, even in situations where nginx does not resolve correctly. According to http://nginx.org/en/docs/http/ngx_http_core_module.html#resolver, the nginx resolver caches answers for the TTL. – hey Aug 21 '18 at 20:42
  • Thanks, the post @RichardSmith mentioned helped us understand the variable suggestion from Alexey Ten. We replaced the hostname with a variable, and it works! – hey Aug 22 '18 at 00:04
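
For later readers, the variable-based fix described in the comments above might look roughly like the following. This is only a sketch, assuming the rest of the configuration stays as posted in the question; `$upstream_host` is a hypothetical variable name chosen for illustration and is not taken from the original config.

```nginx
location / {
    # AWS VPC resolver (base address of the VPC CIDR + 2), as in the question.
    # valid=15s overrides the record's TTL for nginx's resolver cache.
    resolver 10.0.0.2 valid=15s;

    # Hypothetical variable name. Because proxy_pass now contains a variable,
    # nginx looks the name up at request time through the resolver above
    # instead of resolving it once when the configuration is loaded.
    set $upstream_host webservers.ourservers.local;
    proxy_pass http://$upstream_host;

    proxy_set_header X-Real-IP  $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto https;
    proxy_set_header X-Forwarded-Proto-New https;
    proxy_set_header X-Forwarded-Port 443;
    proxy_set_header Host $host;
    proxy_redirect     off;
}
```

With a literal hostname in `proxy_pass`, nginx resolves the name when it starts or reloads and the `resolver` directive is never consulted for it, which would match the behaviour seen in the question. Since this `proxy_pass` has no URI part, the original request URI should still be passed to the backend unchanged.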
