
I currently have a Python script that runs on my local machine and uploads files by calling a basic API I built with Flask on a remote machine. The script iterates through a list of file paths, checks each one against some criteria, and uploads the file if it meets them. Here is a stripped-down version of the script for clarity:

import os

import requests
from twisted.internet import reactor, threads


def upload_file(path):
    # Runs in a worker thread, so the blocking requests call is fine here.
    url = 'http://mydomain.com/upload'
    head, tail = os.path.split(path)
    # Close the file handle once the request has been sent.
    with open(path, 'rb') as f:
        r = requests.post(url, files={'files': (tail, f)})
    return r.status_code


def uploadCallback(status):
    if status == 200:
        print("file was successfully uploaded, now do something cool")
    else:
        print("something went wrong")

paths = ['/Users/myFiles/file1.txt', '/Users/myFiles/file2.txt']

meets_criteria = True

for path in paths:
    if meets_criteria:  # imagine we check to see if its extension is .txt
        # Hand the upload to Twisted's thread pool; the callback fires with the status code.
        d = threads.deferToThread(upload_file, path)
        d.addCallback(uploadCallback)

reactor.run()

The problem is that the script blocks other files from being uploaded as they are detected. They all eventually upload, but I need to send each file asynchronously, though not simultaneously. I looked into grequests, requests-futures, asyncore, Twisted, etc. Twisted looks like the best option, but it will require me to learn Twisted and redesign the script. Any tips or opinions on whether this is the route I should take would be greatly appreciated. Thanks in advance.
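To make "asynchronously but not simultaneously" concrete, here is a minimal sketch that uses the standard library's `concurrent.futures` (Python 3) rather than Twisted. A single worker thread takes uploads off a queue, so the caller is never blocked, yet only one upload runs at a time. `fake_upload` and `upload_in_background` are hypothetical names introduced for illustration; in the real script the `upload` argument would be `upload_file`.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_upload(path):
    """Hypothetical stand-in for upload_file(); returns an HTTP-like status."""
    return 200 if path.endswith('.txt') else 415


def upload_in_background(paths, upload=fake_upload):
    """Queue every path on one worker thread and return the futures.

    The call returns immediately. Because the pool has a single worker,
    uploads run one after another, in the order they were submitted.
    """
    executor = ThreadPoolExecutor(max_workers=1)
    futures = [executor.submit(upload, path) for path in paths]
    # Stop accepting new work; uploads that are already queued still run.
    executor.shutdown(wait=False)
    return futures


futures = upload_in_background(['file1.txt', 'file2.bin'])
for future in futures:
    # result() waits for that particular upload to finish.
    print(future.result())
```

Raising `max_workers` would allow several uploads at once, which is the opposite trade-off from the one described in the question.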

django-d
  • I'm not sure what answer to give you other than "yes, you should use Twisted for this". This is really vague. Have you tried to write it using Twisted? What problems did you encounter? – Glyph Nov 29 '13 at 03:06
  • @Glyph I've edited the code to look more like a Twisted implementation. I haven't implemented it because the real code will require some major rework. Just trying to see if I'm on the right track. Will this "sample code" comply with my described scenario? – django-d Nov 29 '13 at 18:27
  • OK, I went ahead and tested the code above and it does indeed work as I need. Thanks for confirming my hunch that Twisted is the right solution. – django-d Nov 29 '13 at 23:06
  • If the only feature from Twisted that you are using is deferToThread, then you could use `multiprocessing.pool.ThreadPool` as an alternative, e.g. with `import multiprocessing.pool as mp`: `for status in mp.ThreadPool(20).imap_unordered(upload_file, paths): print("success" if status == 200 else "error")` uploads up to 20 files at a time (concurrently). – jfs Dec 16 '13 at 08:06
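The thread pool suggestion in the last comment can be sketched as a self-contained script. This is only an illustration under stated assumptions: `fake_upload` and `upload_all` are hypothetical names, and `fake_upload` stands in for the real `upload_file` so the example runs without a server.

```python
from multiprocessing.pool import ThreadPool


def fake_upload(path):
    """Hypothetical stand-in for upload_file(); returns an HTTP-like status."""
    return 200 if path.endswith('.txt') else 500


def upload_all(paths, upload=fake_upload, workers=20):
    """Upload up to `workers` files concurrently and label each outcome."""
    pool = ThreadPool(workers)
    try:
        # imap_unordered yields each status as soon as its upload finishes,
        # so the results are not necessarily in the order of `paths`.
        return ['success' if status == 200 else 'error'
                for status in pool.imap_unordered(upload, paths)]
    finally:
        pool.close()
        pool.join()


print(upload_all(['file1.txt', 'file2.txt', 'file3.bin']))
```

Passing `workers=1` would make the uploads run one at a time, at the cost of the concurrency the comment describes.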

0 Answers