For name resolution in my network I run named (BIND) on openSUSE Leap 15.2. The server has two IP addresses configured: one for the server itself, e.g. 192.168.3.150, and a second one for DNS, e.g. 192.168.3.200.

If I send DNS queries to 192.168.3.150, all of them are answered. If I send queries to 192.168.3.200, a few are answered but most are not, and DNS clients such as nslookup or dig run into timeouts.

I have increased the debug level, and this is what I see in the log:

17-Mar-2021 22:44:06.079 client: debug 3: client @0x7f063000b180 127.0.0.1#55255: UDP request
17-Mar-2021 22:44:06.079 client: debug 5: client @0x7f063000b180 127.0.0.1#55255: using view '_default'
17-Mar-2021 22:44:06.079 security: debug 3: client @0x7f063000b180 127.0.0.1#55255: request is not signed
17-Mar-2021 22:44:06.079 security: debug 3: client @0x7f063000b180 127.0.0.1#55255: recursion available
17-Mar-2021 22:44:06.079 security: debug 3: client @0x7f063000b180 127.0.0.1#55255 (my.host.domain.de): query 'my.host.domain.de/A/IN' approved
17-Mar-2021 22:44:06.079 security: debug 3: client @0x7f0630007440 127.0.0.1#35797 (my.host.domain.de): reset client
17-Mar-2021 22:44:06.079 security: debug 3: client @0x7f063000b180 127.0.0.1#55255 (my.host.domain.de): reset client

My named configuration is attached below.

/etc/named.conf

options {
    directory "/var/lib/named";
    managed-keys-directory "/var/lib/named/dyn/";
    dump-file "/var/log/named_dump.db";
    statistics-file "/var/log/named.stats";
    forwarders { xxx.xxx.xxx.xxx; };
    listen-on port 53 { 127.0.0.1; 192.168.3.150; 192.168.3.200; };
    listen-on-v6 { none; };
    query-source address 192.168.3.200 port *;
    transfer-source 192.168.3.200 port 53;
    allow-query { 127.0.0.1; 192.168.x.0/24; 192.168.x.0/24; 192.168.x.0/24; 192.168.x.0/24; 192.168.x.0/24; };
    notify no;
    disable-empty-zone "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.IP6.ARPA";
    allow-transfer { localhost; 192.168.x.170; };
    recursion yes;
};

logging {
    channel default_file {
        file "/var/log/named.log" size 10m;
        severity dynamic;
        print-time yes;
        print-severity yes;
        print-category yes;
    };
    category default{ default_file; };
};

zone "." in {
    type hint;
    file "root.hint";
};

zone "localhost" in {
    type master;
    file "localhost.zone";
};

zone "0.0.127.in-addr.arpa" in {
    type master;
    file "127.0.0.zone";
};

include "/etc/bind/zones.conf";

Any idea why named resets the client?

  • I'm curious - do the queries work if you try them from another machine on the 192.168.3.x subnet? And fail if trying from one of the OTHER 192.168.x.0 (obfuscating private subnets? really???) subnets? – Brandon Xavier Mar 19 '21 at 19:40
  • The queries fail always. It doesn't matter whether they come from the same or a different subnet. – Jonathan Mar 19 '21 at 21:01
  • Apologies for nitpicking, but do the queries "fail always" or "some of them become answered, but most of them not"? This is significant because this seems more like a network issue than a BIND issue. Most likely the router/firewall is sending a request to .200, BIND is replying, and server is sending the reply back thru .150. The r/f is NOT expecting a reply from .150 and drops it. In the case of queries to .150, the r/f IS expecting a reply from .150 and traffic flows normally. You can easily fix this by SNATting the DNS replies, but you need to prove this is the problem first. – Brandon Xavier Mar 20 '21 at 05:48
  • (continued) A short packet capture on the BIND server should be able to easily prove/disprove this theory. Or possibly forcing a TCP connection on dig with `dig +tcp . . . ` (TCP will have a specific connection to talk to, whereas stateless UDP just gets dropped in the stack to be delivered). Or careful observation (see original comment). On a side note: From a sysadmin point of view, I've never found turning up the debug level on BIND to be that useful - the non-debug level messages usually have plenty of info -- YMMV. – Brandon Xavier Mar 20 '21 at 06:03
  • Thanks for your answer. I think I expressed myself in a misleading way. A few queries are answered, but most are not. It makes no difference which subnet the queries come from, whether 192.168.3.x or another. I think I can rule out a network problem. There is nothing to be seen in this regard in my firewall logs. I have already run a tcpdump on the DNS server. It shows only that the queries are received from the clients, but no response is sent from the server. – Jonathan Mar 23 '21 at 16:11
  • Sorry to keep harping on this from the network POV (the reason I am is because I routinely deal with a nearly identical scenario with some RADIUS clusters I manage: UDP application, server has multiple IPs on same subnet (a local and a VIP), traffic is NAT'ed from a firewall, and if traffic doesn't go back out on the same IP it came in on, it gets dropped). Having said that, have you tried the `dig` with the `+tcp` option? – Brandon Xavier Mar 23 '21 at 19:20
  • When I run dig with the +tcp option, all of the queries work. But there is no firewall and no NAT between the two servers I am testing with. They are in the same subnet. – Jonathan Mar 24 '21 at 14:20
  • Don't forget the firewall on the server itself. Same caveats apply to it as a network firewall/router. – Brandon Xavier Mar 24 '21 at 15:16
  • Anyway, the `dig +tcp` has convinced me this is a network issue. This iptables command (I'm not familiar with SUSE's firewall implementations) _should_ help: `iptables -t nat -A POSTROUTING -s 192.168.3.150/32 -p udp -m udp --sport 53 -j SNAT --to-source 192.168.3.200:53` Warning: That command *will* break your queries from .150. You probably want to either refine it to be more selective, or stop doing queries on .150 – Brandon Xavier Mar 24 '21 at 15:29
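
The theory in the comments (a UDP reply leaving from a different address than the one the query was sent to) can also be checked from a client without a packet capture. The following is only a minimal sketch, not something taken from the thread: the helper names `build_query` and `reply_source` are hypothetical, and the `port` parameter exists only so the function can be pointed at a non-standard port. It sends one A query over an unconnected UDP socket and returns the source IP of whatever datagram comes back.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: ID, flags (only RD set), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE=1 (A), QCLASS=1 (IN)
    return header + qname + struct.pack("!HH", 1, 1)


def reply_source(server_ip: str, name: str, timeout: float = 3.0, port: int = 53):
    """Send one UDP query and return the source IP of the first reply.

    Returns None if nothing arrives before the timeout. The socket is
    left unconnected on purpose, so a reply from a different address
    than server_ip is still received instead of being filtered out.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name), (server_ip, port))
        try:
            _, (src_ip, _src_port) = sock.recvfrom(4096)
        except socket.timeout:
            return None
        return src_ip
```

Calling `reply_source("192.168.3.200", "my.host.domain.de")` and getting back `"192.168.3.150"` would support the wrong-source-address theory. A result of `None` only says that no datagram reached the client, which is consistent with either a dropped reply or no reply being sent at all, so it would still need to be compared against the tcpdump output on the server.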