We had an issue today with the nginx resolver in an AWS environment. According to the nginx log files, nginx did not resolve to the right servers after the DNS records changed.
The DNS name originally resolved to:
webservers.ourservers.local A 10.0.0.5
webservers.ourservers.local A 10.0.1.8
... but as soon as one of these IP addresses changed, nginx did not pick up the change and kept trying to reach the outdated IP address, for far longer than the 60-second TTL.
According to nginx.org/en/docs/http/ngx_http_core_module.html#resolver, the nginx resolver caches answers only for the TTL of the response or, in our case, for the time configured with valid= (15 seconds).
My colleagues moved the resolver line from the location block to the server block, but it did not help. We need to restart nginx to force it to resolve to the new IP.
Nginx configuration:
http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;
    log_format main '$scheme - $http_x_client_ip - $remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';
    access_log /var/log/nginx/access.log main;
    error_log /var/log/nginx/error.log;
    server {
        listen 8080 default;
        set $OUR_ROOT /var/www/pub;
        location /health_check.html {
            root $OUR_ROOT;
            auth_basic off;
            allow all;
        }
        location / {
            # Workaround to flush nginx: http://nginx.org/en/docs/http/ngx_http_core_module.html#resolver
            # https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_DHCP_Options.html#AmazonDNS
            # This will be your VPC's IPv4 CIDR value + 2
            # You can find this value in /etc/resolv.conf on the EC2 instance
            resolver 10.0.0.2 valid=15s;
            proxy_pass http://webservers.ourservers.local; # Will be resolved by Route 53
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto https;
            proxy_set_header X-Forwarded-Proto-New https;
            proxy_set_header X-Forwarded-Port 443;
            proxy_set_header Host $host;
            proxy_redirect off;
        }
    }
    # include custom config
    include /etc/nginx/conf.d/*.conf;
}
We use nginx version 1.13.7.
What did we do wrong? How can we make nginx re-fetch the DNS records?
Update/Clarification:
10.0.0.2 is the AWS internal nameserver for VPCs (https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_DHCP_Options.html#AmazonDNS). It resolves the name correctly when queried with dig, even while nginx is not resolving correctly at the same time. We tested these cases with:
dig webservers.ourservers.local @10.0.0.2
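For context on the resolver behaviour described above: as far as we understand the nginx documentation, a literal hostname in proxy_pass is resolved once when the configuration is loaded, and the resolver directive is only consulted at request time when the proxy_pass value contains a variable. A minimal sketch of that variant follows. It is an illustration only, not our running config; the variable name $backend_host is hypothetical, and the other proxy_set_header lines are omitted for brevity.

```nginx
location / {
    # Same VPC nameserver and cache time as in our config above.
    resolver 10.0.0.2 valid=15s;

    # Hypothetical variable name; not part of our current configuration.
    set $backend_host webservers.ourservers.local;

    # Because the value contains a variable, nginx looks the name up
    # through the resolver at request time instead of at config load.
    proxy_pass http://$backend_host;

    proxy_set_header Host $host;
}
```

Since no URI part follows the variable in proxy_pass, the original request URI should be passed to the upstream unchanged, which matches what the literal form in our config does.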