I have a simple Python script that downloads a 300 MB file from a remote server using the requests library. I would like to reduce the download time with the threading and Queue modules, and I am trying to figure out how to use them for this.
My idea is to use 4 threads, split the file into four chunks, and have each thread download one chunk. My download code currently looks like this:
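import threading
import Queue  # Python 2 name; the module is called `queue` in Python 3
import requests

queue = Queue.Queue()


class downloadThread(threading.Thread):
    def __init__(self, queue):
        # The base class must be initialised before the thread can be started.
        threading.Thread.__init__(self)
        self.inQueue = queue

    def run(self):
        while True:
            url = self.inQueue.get()
            resp = requests.get(url, stream=True)
            with open('/tmp/app.zip', 'wb') as f:
                # iter_content yields the response body piece by piece.
                for chunk in resp.iter_content(chunk_size=1024):
                    if chunk:  # skip empty keep-alive chunks
                        f.write(chunk)
                        f.flush()
            self.inQueue.task_done()


if __name__ == '__main__':
    for x in range(3):
        t = downloadThread(queue)
        t.setDaemon(True)
        t.start()
    queue.put('http://urltofile')
    # Daemon threads are killed when the main thread exits, so wait here.
    queue.join()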
When using iter_content to retrieve the content, I can provide the chunk_size, but is there a way to tell it to start at byte 1024 and stop at byte 2048? I am planning to have thread 1 download bytes 0 to 1024, thread 2 download bytes 1025 to 2048, and so on.
I will have to handle writing the file with separate logic. I am planning to have the threads put the chunks they read into a second queue (in addition to inQueue) and then write them into the file from there. For now, I am trying to figure out how to split the file's chunks between the threads.
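To make the byte-range idea concrete, here is a minimal sketch of what I have in mind, assuming the server honours the HTTP Range header (it then answers with status 206; a plain 200 means the header was ignored and the whole file is being sent). The helper names split_ranges and fetch_range are hypothetical, and session is assumed to be a requests.Session() object. Range ends are inclusive, so bytes=0-1023 is the first 1024 bytes.

```python
def split_ranges(total_size, parts):
    """Return inclusive (start, end) byte ranges that cover total_size bytes."""
    base, extra = divmod(total_size, parts)
    ranges = []
    start = 0
    for i in range(parts):
        # The first `extra` ranges get one additional byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            continue
        ranges.append((start, start + length - 1))
        start += length
    return ranges


def fetch_range(session, url, start, end):
    """Download bytes start..end (inclusive) of url and return them.

    `session` is assumed to be a requests.Session(); the whole range is
    held in memory, which is a simplification for this sketch.
    """
    headers = {'Range': 'bytes={}-{}'.format(start, end)}
    resp = session.get(url, headers=headers, stream=True)
    if resp.status_code != 206:
        # The server ignored or rejected the Range header.
        raise RuntimeError('no partial content, got status %d' % resp.status_code)
    return b''.join(resp.iter_content(chunk_size=1024))
```

With this, split_ranges(2048, 2) gives [(0, 1023), (1024, 2047)], and each thread would call something like fetch_range(requests.Session(), 'http://urltofile', start, end) for its own range. The total size would have to come from somewhere first, for example the Content-Length header of a HEAD request, assuming the server reports it.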
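The two-queue layout could look roughly like the sketch below. It is untested against a real server and has no error handling: worker threads take (start, end) ranges from one queue and put (offset, data) on a second queue, and a single writer seeks to each offset before writing, so the parts may arrive in any order. The names worker, write_parts and download_in_parts are hypothetical, and fetch stands for any callable that returns the bytes for an inclusive range.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2, as in the script above


def worker(tasks, results, fetch):
    """Pull (start, end) ranges from `tasks` and pass the fetched bytes on."""
    while True:
        start, end = tasks.get()
        results.put((start, fetch(start, end)))
        tasks.task_done()


def write_parts(path, results, expected_parts):
    """Take (offset, data) tuples from `results` and write each at its offset."""
    with open(path, 'wb') as f:
        for _ in range(expected_parts):
            offset, data = results.get()
            f.seek(offset)
            f.write(data)


def download_in_parts(path, ranges, fetch, num_threads=4):
    """Fetch every range on worker threads; write the file on this thread."""
    tasks, results = queue.Queue(), queue.Queue()
    for _ in range(num_threads):
        t = threading.Thread(target=worker, args=(tasks, results, fetch))
        t.daemon = True
        t.start()
    for byte_range in ranges:
        tasks.put(byte_range)
    write_parts(path, results, len(ranges))
```

Only one thread touches the file here, so no lock is needed around the writes. If fetch raises inside a worker, the writer would wait forever, so a real version would need to report failures through the results queue as well.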
Thanks