I am having trouble with a script I am writing to automate a process at work. The script reads a list of websites from a txt file, and a second list of paths to append to those URLs from another file.
It then builds a final list in which each element is a website joined with a path, for example 'http://web.com/path/'.
Lastly, it requests each URL with requests.get()
and checks whether that path exists on the site by looking at the status_code: if the status_code is 200, the URL is added to a list of websites that gave a positive result; otherwise the script just moves on to the next one.
import requests

direccion_archivo = 'webs.txt'
webs_list = []
with open(direccion_archivo) as archivo:
    for linea in archivo:
        webs_list.append(linea.rstrip())

direccion_archivo_02 = 'directorios.txt'
direcs_list = []
with open(direccion_archivo_02) as archivo_02:
    for linea in archivo_02:
        direcs_list.append(linea.rstrip())

urls = []
for web in webs_list:
    for direct in direcs_list:
        link = web + direct
        urls.append(link.rstrip())

AdminPanels_websites = []
for website in urls:
    getweb = requests.get(website)
    SSLWeb = requests.exceptions.SSLError(website)
    if SSLWeb is True:
        pass
    if getweb.elapsed.total_seconds() >= 1:
        pass
    if getweb.status_code == 200:
        AdminPanels_websites.append(website)
        print(AdminPanels_websites)
    else:
        pass

with open("paneles.txt", "w") as archivo_03:
    for panel in AdminPanels_websites:
        archivo_03.write(panel.rstrip())
        archivo_03.write("\n")
However, there is a problem I don't know how to solve. When I run the script, everything works at first, but then I suddenly get the following errors:
TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPSConnection object at 0x000001836654D9F0>, 'Connection to www.actionplastics.co.za timed out. (connect timeout=None)')
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='www.actionplastics.co.za', port=443): Max retries exceeded with url:/products/ (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection object at 0x000001836654D9F0>, 'Connection to www.actionplastics.co.za timed out. (connect timeout=None)'))
I have looked into these errors, and from what I found they are supposed to happen when several requests are sent to the same URL. But I am only sending a single request to each web page, so why do I get this error?
I would like to know what you would do in such cases. I would really appreciate it <3
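For reference, the `connect timeout=None` in the traceback suggests that no timeout was passed to requests.get(), so each request waits until the operating system gives up on the connection, and the resulting exception is never caught. A minimal sketch of the usual pattern, assuming a 5-second timeout is acceptable; the helper name `path_exists` and its `get` parameter are hypothetical and only there to keep the example self-contained:

```python
import requests


def path_exists(url, timeout=5, get=requests.get):
    """Return True if `url` answers with status 200.

    Returns False for any other status code and for any network error
    (timeout, SSL failure, refused connection). The name `path_exists`,
    the 5-second default and the injectable `get` are assumptions made
    for this sketch; they are not part of the original script.
    """
    try:
        response = get(url, timeout=timeout)
    except requests.exceptions.RequestException as error:
        # Timeout, ConnectionError and SSLError all derive from
        # RequestException, so one except clause covers them.
        print(f"Skipping {url}: {error.__class__.__name__}")
        return False
    return response.status_code == 200
```

In the loop from the script, a single `if path_exists(website):` followed by `AdminPanels_websites.append(website)` could then stand in for the separate checks, so that one unreachable host no longer stops the whole run.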
I tried handling this with a conditional in the last part of the code. Starting from:
for website in urls:
    getweb = requests.get(website)
    SSLWeb = requests.exceptions.SSLError(website)
and after that, I added:
    ConnectionWeb = requests.exceptions.ConnectionError(website)
    if ConnectionWeb is True:
        pass
But it does not work.
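A likely reason the conditional never fires: calling an exception class such as requests.exceptions.ConnectionError(website) only builds an exception object. It does not make a request and it does not detect or catch an error raised by requests.get(). A small sketch of what that check evaluates, using the placeholder URL from above and a hypothetical variable name:

```python
import requests

# Constructing an exception class only creates an object; nothing is
# raised here and no request is sent.
connection_web = requests.exceptions.ConnectionError("http://web.com/path/")

print(isinstance(connection_web, Exception))  # True: it is just an instance
print(connection_web is True)                 # False: so the `if` body never runs
```

Errors raised inside requests.get() have to be caught with try/except around the call itself rather than compared against True afterwards.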