
I need to download some large files (>30 GB per file) from an FTP server. I'm using ftplib from the Python standard library, but there are some pitfalls: if I download a large file, I can no longer use the connection once the file finishes. I get an EOFError afterwards, so the connection seems to have been closed (due to a timeout?), and for each subsequent file I get error 421.

From what I have read, there are two connections: the data channel and the control channel. The data channel seems to work correctly (I can download the file completely), but the control channel times out in the meantime. I have also read that ftplib (and other Python FTP libraries) are not well suited for large files and may only support files up to around 1 GB. There is a similar question on this topic here: How to download big file in python via ftp (with monitoring & reconnect)?, which is not quite the same because my files are huge in comparison.

My current code looks like this:

import ftplib
import tempfile

ftp = ftplib.FTP_TLS()

ftp.connect(host=server, port=port)
ftp.login(user=user, passwd=password)
ftp.prot_p()
ftp.cwd(folder)

for file in ftp.nlst():
    fd, local_filename = tempfile.mkstemp()
    f = open(fd, "wb")
    # the download itself finishes fine, but afterwards the control
    # connection is dead and the next command fails with EOFError / 421
    ftp.retrbinary('RETR %s' % file, callback=f.write, blocksize=8192)
    f.close()
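
One tweak I considered (untested sketch, reusing the same server, port, user, password and folder placeholders as above) is to open a fresh control connection for every file, so that no single connection has to survive a whole 30 GB transfer plus the next command:

import ftplib
import tempfile

def download_one(filename):
    # fresh control + data connection just for this one file
    ftp = ftplib.FTP_TLS()
    ftp.connect(host=server, port=port)
    ftp.login(user=user, passwd=password)
    ftp.prot_p()
    ftp.cwd(folder)
    fd, local_filename = tempfile.mkstemp()
    with open(fd, "wb") as f:
        ftp.retrbinary('RETR %s' % filename, callback=f.write, blocksize=8192)
    try:
        ftp.quit()  # may fail if the control channel already timed out
    except ftplib.all_errors:
        ftp.close()
    return local_filename

But that means one login per file, and the file listing would still have to come from a separate connection, so I'm not sure this is the right approach.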

Is there any tweak to this code, or another library that I can use, that does support huge files?


1 Answer


If you run into problems with plain FTP, you can try a different transfer tool that copes better with very large files and can resume an interrupted transfer.

A number of suitable options exist; rsync would probably be a good place to start.
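
For example, assuming you have SSH access to the server and rsync installed on both ends (the host name and paths below are placeholders), a minimal way to drive it from Python looks roughly like this:

import subprocess

remote = "user@example.com:/data/hugefile.bin"  # placeholder source
local = "/tmp/hugefile.bin"                     # placeholder destination

# --partial keeps a partially transferred file so an interrupted download
# does not have to start from zero; --progress shows transfer status
subprocess.run(["rsync", "--partial", "--progress", remote, local], check=True)

Because rsync keeps partial files, a killed or timed-out transfer can be restarted with the same command and rsync will reuse the data that already arrived instead of sending everything again.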
