I'm currently experiencing slow query responses on a specific interface of my nameserver. I'm running BIND on a physical server with one network card. That card is used by the interface eth0 and also by the virtual interface eth0:1. Both have an address in the same subnet.
BIND is listening on all IPv4 interfaces and has only some very basic options set. No other performance- or network-related options are set in any included configuration file.
listen-on { any;};
listen-on-v6 port 53 { ::1; };
directory "/var/named";
dump-file "/var/named/data/cache_dump.db";
statistics-file "/var/log/named/named.stats";
memstatistics-file "/var/named/data/named_mem_stats.txt";
When I query the address on the primary interface eth0, I normally get a delayed response of three seconds or more. This applies even when querying that address (not the loopback) from the box itself. When I query the other private IP address, assigned to the virtual interface eth0:1, there is no performance problem and the response always arrives in under one second.
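For anyone wanting to reproduce the comparison, here is a minimal sketch of a repeated timing test against both addresses. The helper names `query_time_ms` and `probe` are hypothetical, and the addresses and domain in the example are placeholders rather than my real ones. It relies only on dig's `+tries` and `+time` options and on the `;; Query time:` line dig prints.

```shell
# Hypothetical helper: extract the "Query time" value (in msec) from dig output on stdin.
# Prints nothing if dig produced no "Query time" line (e.g. on a timeout).
query_time_ms() {
    awk '/^;; Query time:/ { print $4 }'
}

# Hypothetical helper: query one server address several times and print
# "<server> <msec>" per attempt, or "<server> timeout" when no answer arrived.
probe() {
    server=$1; name=$2; count=${3:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        ms=$(dig +tries=1 +time=5 "$name" A @"$server" | query_time_ms)
        echo "$server ${ms:-timeout}"
        i=$((i + 1))
    done
}

# Example usage (placeholder addresses and name):
# probe 192.0.2.10 example.com   # address on eth0
# probe 192.0.2.11 example.com   # address on eth0:1
```

Running both probes back to back, from the box itself and from a remote host, should show whether the slowness follows the address regardless of the client.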
Looking at performance statistics, the box does not appear to be under load and memory isn't being maxed out. I also have another nameserver set up as a slave to this one, on the same network with a near-identical network setup bar addressing, and I have no performance problems querying its main interface (it also has a virtual interface with identical configuration). The server is authoritative for the zones I'm querying, so there is no delay from looking up records elsewhere. I can also confirm via tcpdump that the query is received almost instantly by the server regardless of where it originates, and that the delay occurs between the query being received and the response being sent.
If there's any information that would be useful to have, then rather than downvoting me for missing it in my post, please just leave a comment below and I'll happily provide any helpful details I can. Any suggestions on how best to troubleshoot a problem of this nature, or ideas on what the potential causes could be, would be very much appreciated.
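The tcpdump check can be sketched roughly as follows. The helper name `dns_latency` is hypothetical, and the parsing assumes tcpdump's usual one-line format for UDP DNS traffic when run with `-tt -n` (epoch timestamp in field 1, the query ID at the start of field 6). If your tcpdump build prints lines differently, the field numbers will need adjusting.

```shell
# Hypothetical helper: read `tcpdump -tt -n udp port 53` lines on stdin and print
# "<query id> <milliseconds>" for each response that matches an earlier query.
dns_latency() {
    awk '
        $2 == "IP" {
            id = $6
            sub(/[^0-9].*$/, "", id)                    # strip flags such as "+" or "*-"
            if ($5 ~ /\.53:$/) {                        # destination port 53: a query
                sent[id] = $1
            } else if ($3 ~ /\.53$/ && (id in sent)) {  # source port 53: a response
                printf "%s %.0f\n", id, ($1 - sent[id]) * 1000
                delete sent[id]
            }
        }
    '
}

# Example usage (run as root; eth0 is the slow interface in my case):
# tcpdump -l -tt -n -i eth0 udp port 53 | dns_latency
```

A large gap between a query and its matching response here, with the query itself arriving promptly, points at the server rather than the network path.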
BIND version is 9.3.6-P1-RedHat-9.3.6-25.P1.el5_11.11. I've recently updated to this, but I'm unsure whether these performance issues came about following the upgrade, or whether they existed prior to it.
EDIT: Dig output as requested. Removed domain name being queried and target server.
It's also worth noting that requests sometimes time out completely. The behaviour is quite intermittent: occasional replies come back in under two seconds, but most take over three, with the occasional timeout as mentioned.
[root@hugh-host-01 ~]# dig REMOVED @REMOVED
; <<>> DiG 9.9.4-RedHat-9.9.4-38.el7_3 <<>> REMOVED @REMOVED
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52129
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 4
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;REMOVED. IN A
;; ANSWER SECTION:
REMOVED. 5 IN A REMOVED
;; AUTHORITY SECTION:
REMOVED. 5 IN NS REMOVED.
REMOVED. 5 IN NS REMOVED.
REMOVED. 5 IN NS REMOVED.
;; ADDITIONAL SECTION:
REMOVED. 5 IN A REMOVED
REMOVED. 5 IN A REMOVED
REMOVED. 5 IN A REMOVED
;; Query time: 3633 msec
;; SERVER: REMOVED#53(REMOVED)
;; WHEN: Sat Jan 07 00:49:01 GMT 2017
;; MSG SIZE rcvd: 155
Thanks for your time,
Hugh