
I am trying to find a way to find the CDN that is serving a certain domain in Python.

My idea is to use DNS lookups and read the CNAME field in the response. From that I can map the CNAME to a certain provider. I am aware of a similar thread in How can I filter the domains served by a CDN from a list of domain names?. However, as highlighted there, there is no guarantee of obtaining a CNAME for each domain tested. I wonder, then, whether there is another method in Python to find the corresponding CDN for a domain when the CNAME is not in the response. So far I have tried:

# I am using Python 3.7.0 and the dnspython library
import dns.resolver

dns_results = dns.resolver.query('youtube.com', 'CNAME')

I am getting an error like this:

NoAnswer: The DNS response does not contain an answer to the question: youtube.com. IN CNAME

I was expecting the answer to be Google.

Thank you for your help.

Paul
  • 1) Not all CDNs can be found by checking CNAMEs; some do not use them. 2) A CNAME query for youtube.com, like any other CNAME query on an apex, will never return records, as a CNAME cannot exist at the apex. If you query www.youtube.com for CNAME, then it is another matter... – Patrick Mevzek Aug 09 '19 at 17:34
  • Thanks @Patrick Mevzek for your help. Can you clarify why www.youtube.com and youtube.com produce different answers? – Paul Aug 09 '19 at 20:27
  • `youtube.com` is a domain name, and at its apex (when querying directly for it) you cannot have `CNAME` records. This is per the DNS specifications: a `CNAME` record cannot coexist with anything else, and the apex already has `NS` and `SOA` records. Any other name below the apex can have a `CNAME` if it does not have any other record type. – Patrick Mevzek Aug 09 '19 at 20:37
  • Should I then use the URL (index page) instead? – Paul Aug 09 '19 at 20:47
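
To illustrate the apex point from the comments, here is a minimal sketch that asks for a CNAME on both the bare name and its `www.` variant and treats `NoAnswer` as "no CNAME here" rather than as a failure. It assumes dnspython is installed; the helper names `names_to_try` and `cname_targets` are made up for this example. Whether a particular `www.` name actually has a CNAME depends on how that zone is configured.

```python
def names_to_try(domain):
    """Return the name itself plus its 'www.' variant, apex first."""
    name = domain.rstrip(".").lower()
    if name.startswith("www."):
        return [name]
    return [name, "www." + name]


def cname_targets(domain):
    """Map each candidate name to its CNAME target, skipping names without one."""
    # dnspython is third-party, so it is imported only when a lookup is made.
    import dns.exception
    import dns.resolver

    found = {}
    for name in names_to_try(domain):
        try:
            # dnspython 2.x prefers dns.resolver.resolve(); query() matches the question.
            answer = dns.resolver.query(name, "CNAME")
        except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
            # No CNAME at this name (always the case at an apex) or no such name.
            continue
        except (dns.resolver.NoNameservers, dns.exception.Timeout):
            # Resolver-side failure; an assumption here is that skipping is acceptable.
            continue
        found[name] = answer[0].target.to_text()
    return found
```

Calling `cname_targets('youtube.com')` would return a dictionary with an entry only for the names that really have a CNAME, so an empty result means another signal (such as the IP owner) is needed.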

3 Answers


Maybe something like this:

>>> import ipwhois
>>> import dns.resolver
>>> result = dns.resolver.query('youtube.com', 'A')
>>> print(ipwhois.IPWhois(result[0].to_text()).lookup_whois()["nets"][0]["description"])
Google LLC
>>> result = dns.resolver.query('reddit.com', 'A')
>>> print(ipwhois.IPWhois(result[0].to_text()).lookup_whois()["nets"][0]["description"])
Fastly
>>> result = dns.resolver.query('imgur.com', 'A')
>>> print(ipwhois.IPWhois(result[0].to_text()).lookup_whois()["nets"][0]["description"])
Fastly
>>> result = dns.resolver.query('stackoverflow.com', 'A')
>>> print(ipwhois.IPWhois(result[0].to_text()).lookup_whois()["nets"][0]["description"])
Fastly
>>> result = dns.resolver.query('www.primevideo.com', 'A')
>>> print(ipwhois.IPWhois(result[0].to_text()).lookup_whois()["nets"][0]["description"])
Amazon Technologies Inc.
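
The repeated calls above could be wrapped in a small helper. This is only a sketch, assuming dnspython and ipwhois are installed; the names `net_descriptions` and `describe_domain` are hypothetical. It collects the description of every network in the whois result instead of only the first one. Note that this identifies the owner of the IP block, which is not necessarily the CDN.

```python
def net_descriptions(whois_result):
    """Collect the non-empty 'description' fields from an ipwhois result dict."""
    return [
        net["description"]
        for net in whois_result.get("nets", [])
        if net.get("description")
    ]


def describe_domain(domain):
    """Resolve the first A record of domain and describe the owning network(s)."""
    # Third-party libraries are imported lazily so the pure helper above stands alone.
    import dns.resolver
    from ipwhois import IPWhois

    answer = dns.resolver.query(domain, "A")
    ip = answer[0].to_text()
    return net_descriptions(IPWhois(ip).lookup_whois())
```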
Dusan Bajic
  • Thanks for your help @Dusan Bajic. However, I think that the info you got corresponds to the short name of the AS. I am looking for the CDN serving the domain. – Paul Aug 09 '19 at 20:25
  • I also wonder why you think that field in the lookup_whois() dictionary provides a clue of the CDN serving the domain. – Paul Aug 11 '19 at 16:07

You can look up the PTR record for the IP address of the site being checked.

$ host -t A youtube.com
youtube.com has address 216.58.195.78
$ host -t PTR 216.58.195.78
78.195.58.216.in-addr.arpa domain name pointer sfo07s16-in-f78.1e100.net.

Then, in some lookup table, you would map 1e100.net to Google, cloudfront.net to Amazon, etc.

This is also not 100% reliable. Your code should also handle error responses: NXDOMAIN for IP addresses that have no entry in the in-addr.arpa zone (these are most probably not served by a CDN), and SERVFAIL for malfunctioning DNS servers.
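
The same steps can be sketched in Python. This is an illustration under assumptions, not a complete solution: it assumes dnspython is installed, the table holds only the two suffixes mentioned above, and the function names are invented for this example. With dnspython, NXDOMAIN surfaces as `dns.resolver.NXDOMAIN`, and a SERVFAIL from every server surfaces as `dns.resolver.NoNameservers`.

```python
import ipaddress

# Deliberately tiny, illustrative table; extend it with suffixes you trust.
PTR_SUFFIX_TO_PROVIDER = {
    "1e100.net": "Google",
    "cloudfront.net": "Amazon",
}


def provider_from_hostname(hostname):
    """Return the provider whose suffix matches hostname, or None."""
    name = hostname.rstrip(".").lower()
    for suffix, provider in PTR_SUFFIX_TO_PROVIDER.items():
        if name == suffix or name.endswith("." + suffix):
            return provider
    return None


def reverse_name(ip):
    """Build the in-addr.arpa (or ip6.arpa) name used for the PTR query."""
    return ipaddress.ip_address(ip).reverse_pointer


def provider_from_ip(ip):
    """Look up the PTR record for ip and map it to a provider, or None."""
    # dnspython is third-party, so it is imported only when a lookup is made.
    import dns.exception
    import dns.resolver

    try:
        # dnspython 2.x prefers dns.resolver.resolve(); query() matches the question.
        answer = dns.resolver.query(reverse_name(ip), "PTR")
    except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
        # No reverse entry for this address.
        return None
    except (dns.resolver.NoNameservers, dns.exception.Timeout):
        # SERVFAIL from all servers, or no reply in time.
        return None
    return provider_from_hostname(answer[0].target.to_text())
```

Matching on `"." + suffix` rather than a bare `endswith(suffix)` avoids treating an unrelated name such as `not1e100.net` as Google.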

Yuri Ginsburg

Just a note that a CNAME RR is not the only mechanism for routing traffic to a CDN. Some people set a CDN's Anycast IP address directly in their DNS (A/AAAA RRs). Doing a whois lookup on the IP address would probably be a more holistic approach. Another thing to keep in mind is that your solution may resolve the current CDN, but not the only CDN used for the asset. Some websites use NS1, Route53, Cedexis or even DLVR (for video) with a multi-CDN approach.

h4ck3r