
I'm running two machines with PowerDNS, one being the master (SQL) and one being the slave (Bind backend).

After I modify a domain and bump the serial, I get this in the log:

Sep 30 22:13:20 localhost pdns[6884]: 1 domain for which we are master needs notifications
Sep 30 22:13:20 localhost pdns[6884]: Queued notification of domain 'netly.io' to 146.185.146.149
Sep 30 22:13:20 localhost pdns[6884]: Queued notification of domain 'netly.io' to 146.185.147.74
Sep 30 22:13:20 localhost pdns[6884]: Received NOTIFY for netly.io from 146.185.146.149 but slave support is disabled in the configuration
Sep 30 22:13:21 localhost pdns[6884]: Received unsuccessful notification report for 'netly.io' from 146.185.146.149:53, rcode: 4
Sep 30 22:13:21 localhost pdns[6884]: Removed from notification list: 'netly.io' to 146.185.146.149:53
Sep 30 22:13:23 localhost pdns[6884]: No master domains need notifications

I understand it's notifying itself (146.185.146.149) because it is listed as a nameserver for the zone, and that those errors can be ignored. It also appears to notify the other server (146.185.147.74 or 162.243.29.199).

However, the slave logs nothing around that time, and when I cat the zone file, it still has the old serial and the subdomain has not been updated.

dig @slave-server also shows the old settings.

Telling it to reload also doesn't update the Bind zone file:

slave-server # pdns_control reload
Ok
slave-server # tail -f /var/log/daemon.log 
Sep 30 22:21:28 node-e31401 pdns[2259]: Zone 'netly.io' (/etc/powerdns/bind/netly.io.) needs reloading
Sep 30 22:21:28 node-e31401 pdns[2259]: Zone 'netly.io' (/etc/powerdns/bind/netly.io.) reloaded

However, when I entirely restart PDNS it finally figures out it is outdated and correctly fetches the updated zone:

slave-server # /etc/init.d/pdns restart
[ ok ] Restarting PowerDNS Authoritative Name Server: pdns.
slave-server # tail -f /var/log/daemon.log 
Sep 30 22:23:48 node-e31401 pdns[2911]: 2 slave domains need checking, 0 queued for AXFR
Sep 30 22:23:48 node-e31401 pdns[2911]: Received serial number updates for 2 zones, had 0 timeouts
Sep 30 22:23:48 node-e31401 pdns[2911]: Domain netly.io is stale, master serial 2013093004, our serial 2013093003
Sep 30 22:23:48 node-e31401 pdns[2911]: Domain titify.com is fresh (not presigned, no RRSIG check)
Sep 30 22:23:48 node-e31401 pdns[2911]: No master domains need notifications
Sep 30 22:23:48 node-e31401 pdns[2911]: Initiating transfer of 'netly.io' from remote '146.185.146.149'
Sep 30 22:23:48 node-e31401 pdns[2911]: AXFR started for 'netly.io', transaction started
Sep 30 22:23:48 node-e31401 pdns[2911]: Zone 'netly.io' (/etc/powerdns/bind/netly.io.) reloaded
Sep 30 22:23:48 node-e31401 pdns[2911]: AXFR done for 'netly.io', zone committed with serial number 2013093004
Sep 30 22:23:48 node-e31401 pdns[2911]: Done launching threads, ready to distribute questions

What am I missing here? What is causing the master to correctly notify the slave, but the slave not to fetch the new zone?

Edit:

tcpdump:

node-fd1d01 ~ # tcpdump -n 'host 146.185.146.149 and port 53'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
09:51:38.042713 IP 146.185.146.149.42478 > 162.243.29.199.53: 61745 notify [b2&3=0x2400] SOA? netly.io. (26)
09:51:41.043323 IP 146.185.146.149.42478 > 162.243.29.199.53: 61745 notify [b2&3=0x2400] SOA? netly.io. (26)
09:51:46.044145 IP 146.185.146.149.42478 > 162.243.29.199.53: 61745 notify [b2&3=0x2400] SOA? netly.io. (26)
09:51:52.049533 IP 146.185.146.149.42478 > 162.243.29.199.53: 59408 notify [b2&3=0x2400] SOA? netly.io. (26)
09:51:55.050715 IP 146.185.146.149.42478 > 162.243.29.199.53: 61745 notify [b2&3=0x2400] SOA? netly.io. (26)
09:51:55.050753 IP 146.185.146.149.42478 > 162.243.29.199.53: 59408 notify [b2&3=0x2400] SOA? netly.io. (26)
09:52:00.053327 IP 146.185.146.149.42478 > 162.243.29.199.53: 59408 notify [b2&3=0x2400] SOA? netly.io. (26)
09:52:09.056321 IP 146.185.146.149.42478 > 162.243.29.199.53: 59408 notify [b2&3=0x2400] SOA? netly.io. (26)

Log doesn't show anything new (latest at 09h48):

node-fd1d01 /etc/powerdns/bind # tail -f /var/log/daemon.log 
Oct  2 09:47:59 localhost pdns[2253]: Domain netly.io is fresh (not presigned, no RRSIG check)
Oct  2 09:47:59 localhost pdns[2253]: Domain titify.com is fresh (not presigned, no RRSIG check)
Oct  2 09:47:59 localhost pdns[2253]: No master domains need notifications
Oct  2 09:47:59 localhost pdns[2253]: Done launching threads, ready to distribute questions
Oct  2 09:48:00 localhost ntpd[2144]: Listen normally on 6 tun0 172.17.24.1 UDP 123
Oct  2 09:48:00 localhost ntpd[2144]: Listen normally on 7 tun1 172.17.16.1 UDP 123
Oct  2 09:48:00 localhost ntpd[2144]: peers refreshed
Oct  2 09:48:12 localhost dbus[2093]: [system] Activating service name='org.freedesktop.ConsoleKit' (using servicehelper)
Oct  2 09:48:12 localhost dbus[2093]: [system] Successfully activated service 'org.freedesktop.ConsoleKit'
Oct  2 09:48:59 localhost pdns[2253]: No new unfresh slave domains, 0 queued for AXFR already

But when I cat the zone file (in Bind format) it's not updated.

Tuinslak
  • The initial post got updated with the configs. – Tuinslak Oct 01 '13 at 18:21
  • Make sure the notify from master to slave isn't blocked by any firewall. – Stefan Oct 02 '13 at 07:34
  • TCP/53, right? Those ports are open. – Tuinslak Oct 02 '13 at 07:56
  • Notify uses UDP/53 (master: random port, slave: port 53). You could watch with `tcpdump -n 'host 146.185.146.149 and port 53'` on your slave, and trigger `pdns_control notify netly.io` on the master. – Stefan Oct 02 '13 at 08:28
  • I've updated the initial post. I can see the notifications, but the zone files do not get updated. – Tuinslak Oct 02 '13 at 09:56
  • Your secondaries (*.titify.com) seem to be entirely unreachable from the Internet. Not sure this is causing your problem, but it certainly doesn't help. Also makes it hard to debug from the outside. – Habbie Oct 03 '13 at 10:07
  • It's a droplet from DigitalOcean. I've booted it again now. – Tuinslak Oct 03 '13 at 18:52
  • Actually, thank you. I figured it out... I'll post my own answer. – Tuinslak Oct 03 '13 at 18:57

3 Answers

We were experiencing this and it turns out that the target of the DNS notification message was actually refusing the message.

Notice the "notify Refused" below. Substituted fake server and zone names.

# tcpdump -v -r notify.pcap
reading from file notify.pcap, link-type LINUX_SLL (Linux cooked)
00:00:33.210137 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 61) master.dns.server.46861 > slave.dns.server.domain: 49437 notify SOA? zoneinquestion.com. (33)
00:00:33.236488 IP (tos 0x0, ttl 55, id 17352, offset 0, flags [none], proto UDP (17), length 61) slave.dns.server.domain > master.dns.server.46861: 49437 notify Refused- 0/0/0 (33)
00:00:36.244057 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 61) master.dns.server.46861 > slave.dns.server.domain: 48449 notify SOA? zoneinquestion.com. (33)
00:00:36.269682 IP (tos 0x0, ttl 55, id 17353, offset 0, flags [none], proto UDP (17), length 61) slave.dns.server.domain > master.dns.server.46861: 48449 notify Refused- 0/0/0 (33)
00:00:36.519361 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto UDP (17), length 61) master.dns.server.46861 > slave.dns.server.domain: 65128 notify SOA? zoneinquestion.com. (33)
00:00:36.544391 IP (tos 0x0, ttl 55, id 17354, offset 0, flags [none], proto UDP (17), length 61) slave.dns.server.domain > master.dns.server.46861: 65128 notify Refused- 0/0/0 (33)

Captured this output on the master with the following:

tcpdump -U -i any -w notify.pcap -s 1600 host slave.dns.server
lance.johnsn

The problem was that port 53 was firewalled on the external interface, but not on localhost or on the VPN interface. I hadn't noticed because I usually tested with dig @localhost.

If I understand correctly, the master sends the NOTIFY to UDP/53 on the slave (as Stefan pointed out). Because that port was partially firewalled, the notification was never processed, and that caused the problem.
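
To catch this kind of partial firewalling, it helps to probe UDP/53 from another machine rather than from localhost. The following is a minimal sketch that hand-builds an SOA query and waits for any reply over UDP. The function names `build_soa_query` and `udp53_reachable` are made up for illustration, it assumes IPv4, and it must be run from a host outside the slave for the result to mean anything.

```python
import socket
import struct


def build_soa_query(zone: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for the zone's SOA record."""
    # Header: ID, flags (plain query), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0000, 1, 0, 0, 0)
    labels = [part for part in zone.rstrip(".").split(".") if part]
    qname = b"".join(bytes([len(p)]) + p.encode("ascii") for p in labels) + b"\x00"
    question = qname + struct.pack("!HH", 6, 1)  # QTYPE=SOA, QCLASS=IN
    return header + question


def udp53_reachable(server: str, zone: str, timeout: float = 3.0) -> bool:
    """Return True if the server answers a UDP query on port 53 in time."""
    query = build_soa_query(zone)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and ICMP-triggered errors
            return False
    # A genuine reply echoes the query ID in its first two bytes.
    return reply[:2] == query[:2]


if __name__ == "__main__":
    # Address taken from the tcpdump output in the question.
    print(udp53_reachable("162.243.29.199", "netly.io"))
```

A `False` from an outside host while `dig @localhost` works on the slave itself would point at filtering on the external interface, which is the situation described here.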

Master:

Oct  3 18:56:25 localhost pdns[6884]: gmysql Connection successful
Oct  3 18:56:25 localhost pdns[6884]: AXFR of domain 'netly.io' initiated by 162.243.25.159
Oct  3 18:56:25 localhost pdns[6884]: AXFR of domain 'netly.io' allowed: client IP 162.243.25.159 is in allow-axfr-ips
Oct  3 18:56:25 localhost pdns[6884]: gmysql Connection successful
Oct  3 18:56:25 localhost pdns[6884]: gmysql Connection successful
Oct  3 18:56:25 localhost pdns[6884]: AXFR of domain 'netly.io' to 162.243.25.159 finished
Oct  3 18:56:25 localhost pdns[6884]: Received unsuccessful notification report for 'netly.io' from 146.185.146.149:53, rcode: 4
Oct  3 18:56:25 localhost pdns[6884]: Removed from notification list: 'netly.io' to 146.185.146.149:53
Oct  3 18:56:25 localhost pdns[6884]: Removed from notification list: 'netly.io' to 162.243.25.159:53 (was acknowledged)
Oct  3 18:56:27 localhost pdns[6884]: No master domains need notifications

Slave:

Oct  3 18:56:25 localhost pdns[2263]: 1 slave domain needs checking, 0 queued for AXFR
Oct  3 18:56:25 localhost pdns[2263]: Received serial number updates for 1 zones, had 0 timeouts
Oct  3 18:56:25 localhost pdns[2263]: Domain netly.io is stale, master serial 2013100302, our serial 2013100301
Oct  3 18:56:25 localhost pdns[2263]: Initiating transfer of 'netly.io' from remote '146.185.146.149'
Oct  3 18:56:25 localhost pdns[2263]: AXFR started for 'netly.io', transaction started
Oct  3 18:56:25 localhost pdns[2263]: Zone 'netly.io' (/etc/powerdns/bind/netly.io.) reloaded
Oct  3 18:56:25 localhost pdns[2263]: AXFR done for 'netly.io', zone committed with serial number 2013100302
Tuinslak

Don't forget to increase your serial. A NOTIFY will not trigger an AXFR if you haven't increased the serial on the master.
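
As a rough illustration of why the serial matters: a slave compares the master's SOA serial with its own and only transfers when the master's is newer. The sketch below uses the serial number arithmetic from RFC 1982; `serial_is_newer` is a hypothetical helper, and whether a particular PowerDNS version compares serials exactly this way is an assumption.

```python
def serial_is_newer(master_serial: int, our_serial: int) -> bool:
    """True if master_serial is ahead of our_serial (RFC 1982, 32-bit)."""
    diff = (master_serial - our_serial) % 2**32
    # Ahead by at least 1 and by less than half the number space.
    return 0 < diff < 2**31


if __name__ == "__main__":
    # Serials from the slave log in the question.
    print(serial_is_newer(2013093004, 2013093003))
    # Unchanged serial: nothing to transfer.
    print(serial_is_newer(2013093003, 2013093003))
```

With the serials from the question's log the first call is true (the zone is stale) and the second is false, matching the "is stale" and "is fresh" log lines.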

c33s