I set up a set of white-label nameservers at AWS Route53, following both the AWS docs and a write-up I found from an AWS employee. Everything seemed to work fine -- I have never had an issue reaching a test site of mine that uses those white-label nameservers for its DNS.

I asked a few people around the USA to try getting to the site, and they said they were intermittently getting "server not found". I tried examining via online tools like pingdom -- and got some pretty disconcerting error messages back, but with the final statement "but an IP address lookup succeeded in spite of that". Not a real warm fuzzy.

Would anyone mind helping a DNS newbie out a bit? The test site is tsf-test.com; if reachable, it should give a brilliant "error establishing a database connection" result. That is expected.

DNSstuff.com reports this:

SOA record check    No nameservers provided an SOA record for the zone. 
You should configure your nameservers to have a master slave relationship.
The update of the zone information to the slave nameservers should 
be handled through the SOA record.

I would really appreciate any pointers about how to get this working solidly at Route53. Thanks...

EDIT:

Here are the records I have set up at AWS for the custom nameservers:

localroute.net.  NS ns1.localroute.net 172800
                    ns2.localroute.net 
                    ns3.localroute.net 
                    ns4.localroute.net

localroute.net.  SOA  ns1.localroute.net. hostmaster.localroute.net. 2016112702 7200 900 1209600 86400  900

ns1.localroute.net. A  205.251.192.207 172800
ns2.localroute.net. A  205.251.197.175 172800
ns3.localroute.net. A  205.251.195.235 172800
ns4.localroute.net. A  205.251.198.34  172800

Also, for the domain localroute.net, with AWS as registrar, I have glue records for ns1. ns2. ns3. and ns4. - pointing to same IP addresses as above.

Then, for tsf-test.com here are the zone records:

tsf-test.com.   A   xxx.xxx.xxx.xxx 60

tsf-test.com.   NS  ns1.localroute.net 60
                    ns2.localroute.net 
                    ns3.localroute.net 
                    ns4.localroute.net
tsf-test.com.   SOA ns1.localroute.net. hostmaster.localroute.net. 2016112701 7200 900 1209600 86400    900

*.tsf-test.com. CNAME tsf-test.com  60
C C
  • Give us the affected domain name, please. – ceejayoz Dec 13 '16 at 22:04
  • @ceejayoz thank you...and yes,here is the test site I use: tsf-test.com. My vanity nameservers are ns1, ns2, ns3 and ns4.localroute.net – C C Dec 13 '16 at 22:08
  • When I do `dig @ns1.localroute.net tsf-test.com` I get `dig: couldn't get address for 'ns1.localroute.net': not found`. I can't get an IP for your custom nameservers. I'd guess you're missing [glue records](http://serverfault.com/questions/309622/what-is-a-glue-record) but I can't say I've ever had to set those up. – ceejayoz Dec 13 '16 at 22:20
  • You have something very strange going on. http://www.dnsstuff.com/tools#dnsReport|type=domain&&value=tsf-test.com says you don't have SOA records, but I can look them up elsewhere. In some places ns1... doesn't resolve, but in other places it does. I suspect it is SOA related, perhaps there is a replication/conflict across the nameservers. – Drifter104 Dec 13 '16 at 22:28
  • @Drifter104, I think you are right, but I am not smart enough to know what the problem is. I edited the question to provide all the information I have. @ceejayoz - I saw that dig can't get those IP addresses for ns1..ns4 -- but I don't know why. – C C Dec 13 '16 at 23:26
  • I added a CNAME for `*.localroute.net` pointing to `localroute.net`. Now `dig @ns1.localroute.net tsf-test.com` seems to return a valid response. Does this mean the problem is solved? – C C Dec 14 '16 at 00:19
  • Seems to be working for me (southeast US). Also tested across the globe with https://www.whatsmydns.net – Linuxx Dec 14 '16 at 01:21
  • Thanks. The site seems reachable from worldwide test points...but it is bugging the heck out of me that I cannot get an authoritative SOA response from `nslookup` or any of the online tools. I think I'll open a new question that zeroes in on that exact issue. – C C Dec 14 '16 at 01:28
  • I have an answer coming up, no new question necessary. – Michael - sqlbot Dec 14 '16 at 01:28
  • yessir...and thank you in advance; I am twitching over here trying to wrap my head around this. – C C Dec 14 '16 at 01:33

1 Answer

Your glue records are correct. That isn't the issue.

The problem is, you didn't configure the localroute.net domain, itself, to actually use the white label servers, even though you've configured the global authoritative servers at the top level of the .net domain (the gTLD servers) -- via the registrar -- to believe that you did.

If you open up the Route 53 console, and highlight that hosted zone for localroute.net (don't click on the actual domain name, just on the row in the table), I believe you'll find that the 4 nameservers listed on the right side of the screen are not the correct Route 53 servers -- they don't correspond to the same 4 IP addresses that match your 4 white label servers. They should be these:

ns-1455.awsdns-53.org.   205.251.197.175
ns-1003.awsdns-61.net.   205.251.195.235
ns-207.awsdns-25.com.    205.251.192.207
ns-1570.awsdns-04.co.uk. 205.251.198.34

The localroute.net hosted zone -- I suspect you'll find -- will not be using these, but tsf-test.com will be, because it's correct.
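
As a rough way to run that comparison outside the console, here is a minimal Python sketch. The two tables are copied from the records in the question and the list above; the function name `unmatched_servers` is just something I made up for illustration. For a real check you would replace the second table with the four servers (and their resolved IPs) that the console shows for the localroute.net hosted zone.

```python
# A/glue records for the white-label name servers, as listed in the question.
WHITE_LABEL = {
    "ns1.localroute.net.": "205.251.192.207",
    "ns2.localroute.net.": "205.251.197.175",
    "ns3.localroute.net.": "205.251.195.235",
    "ns4.localroute.net.": "205.251.198.34",
}

# The Route 53 servers the hosted zone should be on, from the list above.
EXPECTED_ROUTE53 = {
    "ns-1455.awsdns-53.org.": "205.251.197.175",
    "ns-1003.awsdns-61.net.": "205.251.195.235",
    "ns-207.awsdns-25.com.": "205.251.192.207",
    "ns-1570.awsdns-04.co.uk.": "205.251.198.34",
}


def unmatched_servers(white_label, zone_servers):
    """Return the white-label names whose IP is not one of the zone's servers."""
    zone_ips = set(zone_servers.values())
    return sorted(name for name, ip in white_label.items() if ip not in zone_ips)


# An empty list means every white-label name maps onto one of the zone's servers.
print(unmatched_servers(WHITE_LABEL, EXPECTED_ROUTE53))  # prints []
```

If the list is not empty for the localroute.net zone, the zone lives on a different set of Route 53 servers than the ones your glue records point to.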

But those are the equivalent servers for ns1-ns4, which are allegedly authoritative for localroute.net... yet if you ask those specific Route 53 servers about localroute.net, they have no idea what you're talking about.

;; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 21284
                                       ^^^^^^^
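
If you want to repeat this check against each of the four servers, a small sketch like the following can pull the status out of saved `dig` output. It only parses text; it does not run `dig` itself, and the helper names are my own.

```python
import re


def dig_status(header_line):
    """Extract the status field (e.g. NOERROR, REFUSED) from a dig header line."""
    match = re.search(r"status:\s*([A-Z]+)", header_line)
    return match.group(1) if match else None


def refuses_zone(header_line):
    """True when the queried server answered REFUSED for the zone."""
    return dig_status(header_line) == "REFUSED"


header = ";; ->>HEADER<<- opcode: QUERY, status: REFUSED, id: 21284"
print(dig_status(header))    # prints REFUSED
print(refuses_zone(header))  # prints True
```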

So, assuming the above is accurate, how do you fix it?

The NS records inside the hosted zone itself are irrelevant if the name servers shown for that zone in the console correspond to different IP addresses.

I'm guessing you may have edited the NS records for the localroute.net zone, and you can't arbitrarily do that. It doesn't work that way. The hosted zone must already be on those name servers, or the change doesn't accomplish anything useful.

You'll need to create a new hosted zone for localroute.net using the same process that you used to create tsf-test.com -- so that it is associated with the white label name servers. You don't have to delete the old one first -- you can delete it later. Create the zone, populate the records, and the issue should be resolved.
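
For what it's worth, here is a hedged sketch of that step in Python. I'm assuming the white-label servers come from a reusable delegation set and that you would use boto3, whose Route 53 client has a `create_hosted_zone` call accepting `Name`, `CallerReference` and `DelegationSetId`. The wrapper name and the delegation set ID are placeholders, not values from your account.

```python
import uuid


def create_zone_in_delegation_set(client, zone_name, delegation_set_id):
    """Create a hosted zone on the name servers of an existing delegation set.

    `client` is expected to behave like a boto3 Route 53 client.
    """
    return client.create_hosted_zone(
        Name=zone_name,
        # Route 53 requires a unique string per create request.
        CallerReference=str(uuid.uuid4()),
        # Hypothetical ID; use the one belonging to your white-label set.
        DelegationSetId=delegation_set_id,
    )
```

With boto3 installed and credentials configured, you would call it along the lines of `create_zone_in_delegation_set(boto3.client("route53"), "localroute.net.", "N1EXAMPLE")`, where `N1EXAMPLE` stands in for your real delegation set ID, and then recreate the records in the new zone.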

Michael - sqlbot
  • Michael. I am absolutely stunned. You, sir, are a phenom. You nailed this...exactly right. I believe what happened is when I registered the domain `localroute.net`, at AWS - it automatically created a hosted zone, which I happily went in and edited. Never realizing it was in a separate delegation set. I did exactly as you said, re-created the hosted zone from the AWS CLI, using the shared delegation set...and now I believe it is correct. Thank you so much for taking the time to look at this. – C C Dec 14 '16 at 02:39
  • Not actually the root servers, those only have delegation of the TLDs, but the nameservers for the net tld (in this case). – Håkan Lindqvist Dec 14 '16 at 07:41
  • @HåkanLindqvist, you're absolutely correct. I used the phrase "root servers" in a very imprecise way. It is indeed the layer of servers one level down from the actual root servers. Updated. – Michael - sqlbot Dec 14 '16 at 12:47