I am trying to take the domain of each URL in my dataset and look it up in the WHOIS database.

However, when I run the Jupyter notebook, it gets stuck.

Functions:

def get_features(url,label):
    
    features = []
    ....
    
    dnsrecord = 0
    try:
        domain = whois.whois(urlparse(url).netloc)
    except:
        dnsrecord = 1 
     
    features.append(dnsrecord)
    
    ....
    return features 

To run:

features = []
    
for i in range(len(df)):
    url = df['url'][i]
    label = df['label'][i]
    features.append(get_features(url, label))

When I open the URL in a browser, it shows the following:

This site can’t be reached
Check if there is a typo in www.content.usatoday.com.
If spelling is correct, try running Windows Network Diagnostics.
DNS_PROBE_FINISHED_NXDOMAIN

How do I catch this case, where the webpage fails with "DNS_PROBE_FINISHED_NXDOMAIN"?

1 Answer

DNS_PROBE_FINISHED_NXDOMAIN means that the domain name does not exist, or that the DNS server could not resolve it.

Try modifying your code to catch socket.gaierror, which is raised when name resolution fails:

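import socket
from urllib.parse import urlparse

import whois

def get_features(url, label):
    features = []
    ....

    dnsrecord = 0
    try:
        domain = whois.whois(urlparse(url).netloc)
    except socket.gaierror:
        # Name resolution failed (e.g. NXDOMAIN), so flag the record as missing
        dnsrecord = 1

    features.append(dnsrecord)

    ....
    return features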
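
If you would rather check up front whether a host resolves at all, here is a minimal sketch that uses only the standard library. The helper name domain_resolves is made up for illustration and is not part of the whois package. It assumes that a failed socket.getaddrinfo lookup is a good enough signal for "no DNS record".

```python
import socket
from urllib.parse import urlparse


def domain_resolves(url):
    """Return True if the host part of `url` resolves (hypothetical helper)."""
    # urlparse only fills in the host when a scheme or a leading "//" is
    # present, so retry with "//" prepended for bare names like "example.com".
    host = urlparse(url).hostname or urlparse("//" + url).hostname
    if not host:
        return False
    try:
        socket.getaddrinfo(host, None)
    except (socket.gaierror, UnicodeError):
        # gaierror: the resolver could not find the name (e.g. NXDOMAIN).
        # UnicodeError: the name could not be encoded (e.g. a label is too long).
        return False
    return True
```

Inside get_features you could then set dnsrecord = 0 if domain_resolves(url) else 1 and skip the WHOIS call for hosts that do not resolve. Note that the DNS lookup itself is blocking, so this only avoids the WHOIS round trip for dead domains.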