I am trying to extract the domain from each URL in my dataset and look it up in the WHOIS database.
However, when I run the Jupyter notebook, it gets stuck.
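Function:
    import whois
    from urllib.parse import urlparse

    def get_features(url, label):
        features = []
        ....
        dnsrecord = 0
        try:
            domain = whois.whois(urlparse(url).netloc)
        except Exception:
            # the WHOIS lookup failed, so flag the record as missing
            dnsrecord = 1
        features.append(dnsrecord)
        ....
        return features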
To run:
    features = []
    for i in range(len(df)):
        url = df['url'][i]
        label = df['label'][i]
        features.append(get_features(url, label))
When I open one of the URLs in a browser, it shows the following:
This site can’t be reached
Check if there is a typo in www.content.usatoday.com.
If spelling is correct, try running Windows Network Diagnostics.
DNS_PROBE_FINISHED_NXDOMAIN
How do I handle the case where the site fails with "DNS_PROBE_FINISHED_NXDOMAIN"?
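
For reference, here is a minimal sketch of the kind of check I have in mind, using only the standard library: resolve the hostname first and treat a resolution failure as the NXDOMAIN case, so the WHOIS lookup can be skipped for that URL. The helper name `domain_resolves` and its `resolver` parameter are placeholders of my own, not part of any library. I assume a failed lookup surfaces as `socket.gaierror`, and this does not add a timeout to the WHOIS query itself.

```python
import socket
from urllib.parse import urlparse


def domain_resolves(url, resolver=socket.gethostbyname):
    """Return True if the URL's hostname resolves in DNS, False otherwise.

    `resolver` is a hypothetical hook so the check can be exercised
    without network access; by default it is socket.gethostbyname.
    """
    host = urlparse(url).hostname  # None when the URL has no netloc
    if not host:
        return False
    try:
        resolver(host)
    except (socket.gaierror, UnicodeError):
        # gaierror is raised for name-resolution failures such as NXDOMAIN;
        # UnicodeError can occur for hostnames that cannot be encoded
        return False
    return True
```

Inside `get_features`, this could be called before `whois.whois(...)`, setting `dnsrecord = 1` whenever it returns `False`.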