Our company uses Google Cloud Platform for hosting and DNS (we're not using GCP as a registrar). Recently we were looking at some metrics and saw that slow DNS resolution, often over 100 ms, was a significant contributor to our overall page load times. We set up a Datadog DNS synthetic test, and DNS resolution is consistently slower from the west coast. The synthetic tests run from within AWS and use 8.8.8.8 (Google Public DNS); I see the same behavior when pointing the tests at 1.1.1.1 (Cloudflare DNS).
Ohio & Oregon: https://i.stack.imgur.com/5Sr8o.png
Virginia & California: https://i.stack.imgur.com/WSc8f.png
Also used DNS Perf: https://i.stack.imgur.com/fPLRw.jpg
The NS records given to us by Google Cloud are:
ns-cloud-b1.googledomains.com.
ns-cloud-b2.googledomains.com.
ns-cloud-b3.googledomains.com.
ns-cloud-b4.googledomains.com.
and we have configured all four of them at our registrar:
~ dig NS --.shop +short
ns-cloud-b4.googledomains.com.
ns-cloud-b1.googledomains.com.
ns-cloud-b2.googledomains.com.
ns-cloud-b3.googledomains.com.
I know that GCP discourages using ping/ICMP because it isn't necessarily representative of the latency of other traffic, but the ping times from the west coast imply that the packets are going cross-country:
PING ns-cloud-b1.googledomains.com (216.239.32.107): 56 data bytes
64 bytes from 216.239.32.107: icmp_seq=0 ttl=58 time=65.699 ms
64 bytes from 216.239.32.107: icmp_seq=1 ttl=58 time=67.458 ms
64 bytes from 216.239.32.107: icmp_seq=2 ttl=58 time=66.873 ms
PING ns-cloud-b2.googledomains.com (216.239.34.107): 56 data bytes
64 bytes from 216.239.34.107: icmp_seq=0 ttl=58 time=85.820 ms
64 bytes from 216.239.34.107: icmp_seq=1 ttl=58 time=87.567 ms
64 bytes from 216.239.34.107: icmp_seq=2 ttl=58 time=84.580 ms
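As a rough sanity check on the "cross-country" reading, a round-trip time can be turned into an upper bound on the one-way path length. This is only a sketch: the function name `max_path_km` is made up for illustration, and the figure of about 200 km per millisecond for light in fiber is an approximation that ignores queuing and processing delay, so real paths are shorter than the bound.

```shell
#!/bin/sh
# Hypothetical helper: upper bound on one-way path length in km for a given RTT in ms.
# Assumption: signals travel roughly 200 km per ms in fiber, with no other delay.
max_path_km() {
    awk -v rtt="$1" 'BEGIN { printf "%.0f\n", (rtt / 2) * 200 }'
}

max_path_km 66   # roughly the b1 ping times above
max_path_km 86   # roughly the b2 ping times above
```

A bound of several thousand kilometers is consistent with the replies coming from far away rather than from a nearby anycast site, though it does not prove where the server actually is.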
I also confirmed that this latency exists using the Cogent Looking Glass traceroute/ping: https://www.cogentco.com/en/looking-glass. The GCP docs say:
"Your users will have reliable, low-latency access from anywhere in the world using our anycast name servers."
but the performance we're seeing suggests that our DNS queries are being served from a central location. We are using the .shop TLD, but I saw similar performance for another domain on the .app TLD, so the problem doesn't seem to be the TLD name servers.
Extra Data
The latency we're concerned with is from our users' devices over the public internet, but to remove ISP differences, here is some more data on DNS query latency from VMs within GCP. The latency isn't terrible, but it differs noticeably between locations, and from within GCP I would expect all the name servers to be quick (<25 ms).
# us-central1
alex@alex-1-central:~$ dig @ns-cloud-b1.googledomains.com --.shop | grep time
;; Query time: 16 msec
alex@alex-1-central:~$ dig @ns-cloud-b2.googledomains.com --.shop | grep time
;; Query time: 28 msec
alex@alex-1-central:~$ dig @ns-cloud-b3.googledomains.com --.shop | grep time
;; Query time: 20 msec
alex@alex-1-central:~$ dig @ns-cloud-b4.googledomains.com --.shop | grep time
;; Query time: 0 msec
# us-west2
alex@alex-1-west2:~$ dig @ns-cloud-b1.googledomains.com --.shop | grep time
;; Query time: 40 msec
alex@alex-1-west2:~$ dig @ns-cloud-b2.googledomains.com --.shop | grep time
;; Query time: 60 msec
alex@alex-1-west2:~$ dig @ns-cloud-b3.googledomains.com --.shop | grep time
;; Query time: 52 msec
alex@alex-1-west2:~$ dig @ns-cloud-b4.googledomains.com --.shop | grep time
;; Query time: 48 msec
# us-east1
alex@alex-1-east:~$ dig @ns-cloud-b1.googledomains.com --.shop | grep time
;; Query time: 28 msec
alex@alex-1-east:~$ dig @ns-cloud-b2.googledomains.com --.shop | grep time
;; Query time: 4 msec
alex@alex-1-east:~$ dig @ns-cloud-b3.googledomains.com --.shop | grep time
;; Query time: 12 msec
alex@alex-1-east:~$ dig @ns-cloud-b4.googledomains.com --.shop | grep time
;; Query time: 32 msec
I also set up a test zone in AWS Route 53 and queried its authoritative name servers directly from the same GCP VMs (just as I did above for GCP) and got <20 ms response times from each location.
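Since each number above is a single sample, repeated queries would make the comparison more reliable. Below is a minimal sketch of a helper that summarizes the `;; Query time:` lines from any amount of dig output. The function name `summarize_query_times` and the `$ZONE` variable in the commented loop are illustrative assumptions, not anything provided by dig or GCP.

```shell
#!/bin/sh
# Hypothetical helper: read dig output on stdin and print the count, min,
# average, and max of every ";; Query time: N msec" line it contains.
summarize_query_times() {
    awk '/^;; Query time:/ {
             t = $4 + 0                      # 4th field is the time in msec
             if (n == 0 || t < min) min = t
             if (n == 0 || t > max) max = t
             sum += t
             n++
         }
         END {
             if (n == 0) { print "no samples"; exit 1 }
             printf "n=%d min=%d avg=%.1f max=%d msec\n", n, min, sum / n, max
         }'
}

# Offline demo using the four us-central1 results quoted above.
summarize_query_times <<'EOF'
;; Query time: 16 msec
;; Query time: 28 msec
;; Query time: 20 msec
;; Query time: 0 msec
EOF
# prints: n=4 min=0 avg=16.0 max=28 msec

# Live usage (needs network access; ZONE is a placeholder for the real zone):
# for i in 1 2 3 4 5 6 7 8 9 10; do
#     dig @ns-cloud-b1.googledomains.com "$ZONE"
# done | summarize_query_times
```

Running the commented loop once per name server and per region would show whether the differences persist across samples or were one-off outliers.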