We have two remote offices that both use the same VoIP provider. During sign-up, the provider advised us to change our DNS setup so that the phones use the provider's DNS servers for VoIP calls. Both offices run dnsmasq locally. Here is the DNS portion of the configuration, which is identical in both offices:

# Never forward plain names (without a dot or domain part)
domain-needed
# Never forward addresses in the non-routed address spaces.
bogus-priv

# If you don't want dnsmasq to read /etc/hosts, uncomment the
# following line.
no-hosts
# or if you want it to read another file, as well as /etc/hosts, use
# this.
#addn-hosts=/etc/banner_add_hosts
addn-hosts=/etc/dnsmasq.d/hosts/static.hosts

# Set this (and domain: see below) if you want to have a domain
# automatically added to simple names in a hosts-file.
expand-hosts

# Ensure DNS servers are queried in the order they appear below. That
# will ensure proper georouting for VoIP, and will still work for websites
# on those domains
strict-order

# Our upstream DNS Servers

# 8x8 DNS servers (ensures best georouting for VoIP)
server=/8x8.com/packet8.net/8.28.0.9
server=/8x8.com/packet8.net/192.84.18.11
# 8x8's DNS servers don't handle web domains; fail over to our default servers
server=/8x8.com/packet8.net/#

# default to OpenDNS
server=208.67.222.222
server=208.67.220.220

We ran into one problem when setting up office #1: the VoIP provider's DNS server does not (for some reason) serve their web server host names, so we were unable to connect to their web application. We solved this by adding the final server=/xxxxx.com... line above, which provides a failover for cases where the first server does not return a result. Office #1 has been working fine since we added it.

We set up office #2 recently. The only differences between #1 and #2 are the DNS server hardware (#1 = Ubuntu running on x86_64, #2 = Raspberry Pi) and, consequently, a slight difference in the version of dnsmasq (#1 = 2.68, #2 = 2.76). It has now come to light that office #2 has the same issue we originally had in office #1.

It is my understanding that, based on the configuration above, a request for sso.8x8.com should go through the following steps:

  1. query 8.28.0.9 for sso.8x8.com
  2. get back the CNAME (see the dig results below)
  3. query 192.84.18.11 for the CNAME
  4. get nothing back
  5. query 208.67.222.222 for the CNAME
  6. get back an IP address

When I run dig against an affected host name in each office, I get:

Office #1:

; <<>> DiG 9.9.5-3ubuntu0.16-Ubuntu <<>> sso.8x8.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 17588
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;sso.8x8.com.           IN  A

;; ANSWER SECTION:
sso.8x8.com.        54  IN  CNAME   sso.8x8.com.cdn.cloudflare.net.
sso.8x8.com.cdn.cloudflare.net. 54 IN   A   104.16.110.61
sso.8x8.com.cdn.cloudflare.net. 54 IN   A   104.16.109.61

;; Query time: 12 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Nov 02 13:39:34 PDT 2017
;; MSG SIZE  rcvd: 116

Office #2:

; <<>> DiG 9.9.5-9+deb8u13-Raspbian <<>> sso.8x8.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 28188
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;sso.8x8.com.           IN  A

;; ANSWER SECTION:
sso.8x8.com.        300 IN  CNAME   sso.8x8.com.cdn.cloudflare.net.

;; Query time: 20 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Nov 02 13:40:34 PDT 2017
;; MSG SIZE  rcvd: 84

I can see in both cases that I get the CNAME, but in office #2 it doesn't appear to follow through and request the IP address (if I'm interpreting this correctly).
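
One way to check that interpretation, rather than inferring it from dig alone, would be to have dnsmasq log where it forwards each query. A minimal sketch, assuming the log path below is writable (the path is an illustrative choice, not part of our current setup):

```conf
# Log every query and the upstream server it was forwarded to
log-queries
# Hypothetical log location; leave this out to log via syslog instead
log-facility=/var/log/dnsmasq.log
```

With this enabled, the log should show which upstream server each sso.8x8.com query is forwarded to, and that can be compared between the two offices.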

So I have a couple of questions:

  1. Is my understanding of the sequence of events involved in a complex DNS query like this correct?

  2. Is there a relevant difference between the two architectures and versions of dnsmasq that is causing this issue? What could be causing this?

  3. Is there a workaround to fix this?

Kryten

1 Answer


After much searching and playing with settings, I have found the answer.

It turns out that the behaviour of dnsmasq changed in version 2.69: the order in which server directives are evaluated went from "top down" to "bottom up". The details are described on the dnsmasq mailing list.

The solution was simply to reverse the order of the server directives.
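
For illustration, a reversed version of the server directives from the question might look like the sketch below. It assumes the fix is a straight reversal of every server line; the exact file used in office #2 is not reproduced here, so treat the ordering as an assumption to verify against your own dnsmasq version.

```conf
# Sketch for dnsmasq 2.69 and later, where (per the mailing list)
# server directives are evaluated bottom up.
# Assumption: a straight reversal of the original server lines.

# default to OpenDNS
server=208.67.220.220
server=208.67.222.222

# 8x8's DNS servers don't handle web domains; fail over to our default servers
server=/8x8.com/packet8.net/#
# 8x8 DNS servers (ensures best georouting for VoIP)
server=/8x8.com/packet8.net/192.84.18.11
server=/8x8.com/packet8.net/8.28.0.9
```

After restarting dnsmasq, repeating the dig query for sso.8x8.com should show whether the CNAME is now followed through to an A record, as it is in office #1.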

Kryten