So, I've been able to use multiprocessing to upload multiple files at once to a given server with the following two functions:
import ftplib, multiprocessing, subprocess

def upload(t):
    # These all just return strings representing the various fields I will need.
    server = locker.server
    user = locker.user
    password = locker.password
    service = locker.service
    ftp = ftplib.FTP(server)
    ftp.login(user=user, passwd=password, acct="")
    # Open the local file in a with-block so the handle is closed after the transfer.
    with open(t, "rb") as f:
        ftp.storbinary("STOR " + t.split('/')[-1], f)
    ftp.close()  # Doesn't seem to be necessary, same thing happens whether I close this or not

def ftp_upload(t=files, server=locker.server, user=locker.user, password=locker.password, service=locker.service):
    parsed_targets = parse_it(t)
    ftp = ftplib.FTP(server)
    ftp.login(user=user, passwd=password, acct="")
    remote_files = ftp.nlst(".")
    ftp.close()
    files_already_on_server = [f for f in t if f.split("/")[-1] in remote_files]
    files_to_upload = [f for f in t if f not in files_already_on_server]
    connections_to_make = 3  # The maximum connections allowed by the server is 5, and this error will pop up even if I use 1
    pool = multiprocessing.Pool(processes=connections_to_make)
    pool.map(upload, files_to_upload)
My problem is that I (very regularly) end up getting errors such as:
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 227, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 528, in get
    raise self._value
ftplib.error_temp: 421 Too many connections (5) from this IP
Note: There's also a timeout error that occasionally occurs, but I'm waiting for it to rear its ugly head again, at which point I'll post it.
I don't get this error when I use the command line (i.e. "ftp -inv", "open SERVER", "user USERNAME PASSWORD", "mput *.rar"), even when I have (for example) 3 instances of this running at once.
I've read through the ftplib and multiprocessing documentation, and I can't figure out what is causing these errors. This is somewhat of a problem because I'm regularly backing up a large amount of data and a large number of files.
- Is there some way I can avoid these errors, or is there a different way of having the script do this?
- Is there a way I can tell the script that if it hits this error, it should wait for a second and then resume its work?
- Is there a way I can have the script upload the files in the same order they are in the list (of course speed differences would mean they wouldn't all always be 4 consecutive files, but at the moment the order seems basically random)?
- Can someone explain why/how more connections are being simultaneously made to this server than the script is calling for?
So, just handling the exceptions seems to be working (except for the occasional recursion error...still have no fucking idea what the hell is going on there).
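For illustration, here is a minimal sketch of that kind of handling, written as a loop so that repeated retries never deepen the call stack. It is not the exact code in use: `retry_on_temp_error`, `max_attempts`, `delay`, and the injectable `sleep` are hypothetical names, and it assumes the 421 reply surfaces as `ftplib.error_temp`, as in the traceback above.

```python
import ftplib
import time


def retry_on_temp_error(func, arg, max_attempts=5, delay=1.0, sleep=time.sleep):
    """Call func(arg), pausing and retrying when the server sends a 4xx reply."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func(arg)
        except ftplib.error_temp:
            # 4xx replies (such as 421) are transient, so wait and try again.
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(delay)
```

Because `multiprocessing.Pool` pickles the function it is given, anything passed to `pool.map` would need to be a module-level `def`, for example a small wrapper that calls `retry_on_temp_error(upload, t)`.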
As for #3, I wasn't looking for the uploads to be 100% in order, only for the script to pick the next file in the list to upload. Differences in process speed could still keep the order from being completely sequential, but there would be less variability than in the current system, which seems almost unordered.
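A sketch of one way to get that behaviour, under the assumption that the apparent randomness comes from batching: when no `chunksize` is given, `Pool.map` splits the list into chunks and each worker receives a whole batch of consecutive files. Passing `chunksize=1` hands out one file at a time instead. `run_in_list_order` and `pool_factory` are hypothetical names introduced here, not part of the script above.

```python
import multiprocessing


def run_in_list_order(worker, items, processes=3, pool_factory=multiprocessing.Pool):
    """Hand items to the pool one at a time, in list order, and return the results."""
    pool = pool_factory(processes)
    try:
        # chunksize=1 means an idle worker takes the next single item from
        # the list, rather than a pre-assigned batch of items.
        return list(pool.imap(worker, items, chunksize=1))
    finally:
        # Stop accepting work and wait for the worker processes to exit.
        pool.close()
        pool.join()
```

With this in place, `run_in_list_order(upload, files_to_upload, processes=connections_to_make)` would stand in for the `pool.map(upload, files_to_upload)` call. Files would still finish out of order when transfer times differ, but each one would be started in list order.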