The idea of the program is to check which of the domains/subdomains listed in the subdomains.txt file are live (over the http/https protocol).
I do this by sending a HEAD request to each domain/subdomain and reading the response status code. If any status code comes back, the domain or subdomain is considered live (see the load_url_http function).
To speed up the program, I used concurrent.futures.ThreadPoolExecutor with 200 threads. However, even after increasing the number of threads to 300, the program's speed did not improve much.
I want to improve my program so that it can send thousands of requests at once. Below is part of my source code:
python-request-multil.py
import time
import requests
import concurrent.futures


def load_url_http(protocol: str, domain: str, timeout: int = 10):
    # Return the HTTP status code, or None if the request fails for any reason
    try:
        conn = requests.head(protocol + "://" + domain, timeout=timeout)
        return conn.status_code
    except Exception:
        return None


#--- main ---#
start_time = time.time()
worker = 400
protocol = "http"
timeout = 10
print("Number of worker:", worker)
with concurrent.futures.ThreadPoolExecutor(max_workers=worker) as executor:
    # File that the live subdomains are appended to
    file_live_subdomain = open("live_subdomains.txt", "a")
    # Load domain/subdomain list from file
    with open("subdomains.txt", "r") as f:
        URLS = f.read().split("\n")
    URLS_length = len(URLS)
    # Count the number of live subdomains
    live_count = 0
    # Start the load operations and mark each future with its URL
    future_to_url = {
        executor.submit(load_url_http, protocol, url, timeout): url for url in URLS
    }
    for i, future in enumerate(concurrent.futures.as_completed(future_to_url), start=1):
        url = future_to_url[future]
        print(f"\r--> Checking live subdomain.........{i}/{URLS_length}", end="")
        try:
            data = future.result()
            # If `load_url_http` returns any status code
            if data is not None:
                # print(f'{protocol}://{url}:{data}')
                live_count = live_count + 1
                file_live_subdomain.write(f"\n{protocol}://" + url)
        except Exception as exc:
            print(exc)
    print(f"\n[+] Live domain: {live_count}/{URLS_length}", end="")
    file_live_subdomain.close()
print("\n--- %s seconds ---" % (time.time() - start_time))
Run:
┌──(quangtb㉿QuangTB)-[/mnt/e/DATA/Downloads]
└─$ python3 python-request-multil.py
Number of worker: 100
--> Checking live subdomain.........1117/1117
[+] Live domain: 344/1117
--- 67.41670227050781 seconds ---
┌──(quangtb㉿QuangTB)-[/mnt/e/DATA/Downloads]
└─$ python3 python-request-multil.py
Number of worker: 200
--> Checking live subdomain.........1117/1117
[+] Live domain: 344/1117
--- 54.6825795173645 seconds ---
┌──(quangtb㉿QuangTB)-[/mnt/e/DATA/Downloads]
└─$ python3 python-request-multil.py
Number of worker: 300
--> Checking live subdomain.........1117/1117
[+] Live domain: 339/1117
--- 54.186068058013916 seconds ---
┌──(quangtb㉿QuangTB)-[/mnt/e/DATA/Downloads]
└─$ python3 python-request-multil.py
Number of worker: 400
--> Checking live subdomain.........1117/1117
[+] Live domain: 344/1117
--- 54.19181728363037 seconds ---
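
For comparison with the thread-pool runs above, here is a minimal sketch of the same liveness check written with asyncio from the standard library instead of threads. It is only an illustration under assumptions, not a measured improvement: the names `head_status` and `check_all` are hypothetical, the probe speaks plain HTTP on port 80 only (no https), and the `limit` of 1000 concurrent probes is an arbitrary starting value.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, Optional


async def head_status(host: str, timeout: float = 10.0) -> Optional[int]:
    """Send a bare HEAD request to port 80; return the status code or None."""
    writer = None
    try:
        reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, 80), timeout
        )
        request = f"HEAD / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
        writer.write(request.encode("ascii"))
        await writer.drain()
        status_line = await asyncio.wait_for(reader.readline(), timeout)
        # A status line looks like b"HTTP/1.1 200 OK\r\n"; the code is field 2.
        return int(status_line.split()[1])
    except (OSError, asyncio.TimeoutError, ValueError, IndexError):
        # DNS failure, refused connection, timeout, or an unparsable reply.
        return None
    finally:
        if writer is not None:
            writer.close()


async def check_all(
    hosts: Iterable[str],
    probe: Callable[[str], Awaitable[Optional[int]]] = head_status,
    limit: int = 1000,  # assumed cap on probes in flight at once
) -> Dict[str, Optional[int]]:
    """Probe every host with at most `limit` probes running concurrently."""
    semaphore = asyncio.Semaphore(limit)

    async def bounded(host: str):
        async with semaphore:
            return host, await probe(host)

    pairs = await asyncio.gather(*(bounded(h) for h in hosts))
    return dict(pairs)
```

With the host list read from subdomains.txt, `asyncio.run(check_all(hosts))` would return a dict mapping each host to a status code or `None`, and the hosts with a non-`None` value correspond to the live ones in the original program. Two caveats: asyncio resolves host names through its default thread pool, so DNS lookups may still limit throughput, and the operating system's open-file limit may need raising before thousands of sockets can be open at once.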