I am using the following code to upload large files over a flaky FTP connection. It is also able to resume the upload if there is an error.

#!/usr/bin/env python3
import ftplib
import os
import sys
import time
import socket

from configparser import ConfigParser
from collections import OrderedDict

class FtpUploadTracker:
    def __init__(self, totalSize):
        self.totalSize = totalSize
        self.sizeWritten = 0
        self.lastShownPercent = 0

    def handle(self, block):
        # count the bytes actually sent; the last block can be shorter than 1024
        self.sizeWritten += len(block)
        percentComplete = round((self.sizeWritten / self.totalSize) * 100)

        if percentComplete != self.lastShownPercent:
            self.lastShownPercent = percentComplete
            print(str(percentComplete) + "% complete remaining: " + str(self.totalSize - self.sizeWritten), flush=True)



def ini_to_dict(path):
    """

    Read an ini file into a dict.
    :param path: Path to the file
    :return: an OrderedDict of that file's ini data
    """
    config = ConfigParser()
    config.read(path)
    return_value = OrderedDict()
    for section in reversed(config.sections()):
        return_value[section] = OrderedDict()
        section_tuples = config.items(section)
        for item_tuple in reversed(section_tuples):
            return_value[section][item_tuple[0]] = item_tuple[1]
    return return_value


CONFIG_PATH = os.path.join(os.path.dirname(__file__), "config.ini")


def get_config():
    return ini_to_dict(CONFIG_PATH)

if __name__ == "__main__":
    settings = get_config()
    Server = settings["main"]["server"]
    Username = settings["main"]["username"]
    Password = settings["main"]["password"]
    FileName = sys.argv[1]
    Directory = sys.argv[2]
    tmp_dir = "/tmp"
    filename = FileName

    tries = 0
    done = False

    print("Uploading " + str(filename) + " to " + str(Directory), flush=True)
    print("Upload to temp folder", flush=True)

    while tries < 50 and not done:
        try:
            tries += 1
            with ftplib.FTP(Server) as ftp:
                ftp.set_debuglevel(2)
                print("login", flush=True)
                ftp.login(Username, Password)
                # ftp.set_pasv(False)
                ftp.cwd(tmp_dir)
                with open(filename, 'rb') as f:
                    totalSize = os.path.getsize(filename)
                    print('Total file size : ' + str(round(totalSize / 1024 / 1024, 1)) + ' Mb', flush=True)
                    uploadTracker = FtpUploadTracker(int(totalSize))

                    # Get file size if exists
                    #import code; code.interact(local=dict(globals(), **locals())) 
                    files_list = ftp.nlst()
                    print(files_list, flush=True)
                    if os.path.basename(filename) in files_list:
                        print("Resuming", flush=True)
                        ftp.voidcmd('TYPE I')
                        # ask the server how much it already has and restart from there
                        rest_pos = ftp.size(os.path.basename(filename))
                        f.seek(rest_pos, 0)
                        print("seek to " + str(rest_pos), flush=True)
                        uploadTracker.sizeWritten = rest_pos
                        print(ftp.storbinary('STOR ' + os.path.basename(filename), f, blocksize=1024, callback=uploadTracker.handle, rest=rest_pos), flush=True)
                    else:
                        print(ftp.storbinary('STOR ' + os.path.basename(filename), f, 1024, uploadTracker.handle), flush=True)

                    print("Rename the file now")
                    ftp.cwd(Directory)
                    ftp.sendcmd('RNFR ' + os.path.join(tmp_dir, os.path.basename(FileName)))
                    ftp.sendcmd('RNTO ' + os.path.join(Directory, os.path.basename(FileName)))
                    done = True
        except (BrokenPipeError, ftplib.error_temp, socket.gaierror) as e:
            print(str(type(e)) + ": " + str(e))
            print("connection died, trying again")
            time.sleep(30)

    if not done:
        print("Failed to upload")
        sys.exit(1)
    print("Done")

You also need a config.ini next to the script with these lines:

[main]
server=example.org
username=your username
password=your password
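
The script is called with the local file as the first argument and the destination FTP directory as the second, for example (the script name and destination directory here are just placeholders):

python3 upload.py 2018-03-27_2018-03-13-zynthianos-stretch-lite-0.1.zip /files/images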

However, it seems that on a timeout the partially uploaded file is lost and the upload restarts from scratch. This happens frequently, and with 3 GB files at a 100 kB/s upload speed that means hours lost.

Here is how the output looks (the timestamps on the left are from the shell, next to the output). Note the *cmd* 'QUIT' sent by ftplib seemingly out of nowhere, possibly due to a timeout judging by when the line is sent; it is not issued by my code. After it, the file is nowhere to be seen in ftp.nlst(); all that is left is a test file called yay:

07:16:46 74% complete remaining: 1015369104
07:22:36 75% complete remaining: 977053072
07:26:50 76% complete remaining: 938738064
07:31:19 77% complete remaining: 900422032
07:36:14 78% complete remaining: 862106000
07:41:05 79% complete remaining: 823789968
07:45:48 80% complete remaining: 785473936
07:51:02 81% complete remaining: 747157904
07:56:39 82% complete remaining: 708842896
08:01:34 83% complete remaining: 670526864
08:06:12 84% complete remaining: 632210832
08:11:01 85% complete remaining: 593894800
08:15:48 86% complete remaining: 555578768
08:30:16 *cmd* 'QUIT'
08:30:16 *put* 'QUIT\r\n'
08:30:16 *get* '221 Goodbye.\n'
08:30:16 *resp* '221 Goodbye.'
08:30:16 <class 'BrokenPipeError'>: [Errno 32] Broken pipe
08:30:16 connection died, trying again
08:30:16 login
08:30:17 *cmd* 'USER ****'
08:30:17 *put* 'USER ****\r\n'
08:30:17 *get* '331 Password required for ****\n'
08:30:17 *resp* '331 Password required for ****'
08:30:17 *cmd* 'PASS **********'
08:30:17 *put* 'PASS **********\r\n'
08:30:17 *get* '230 User **** logged in\n'
08:30:17 *resp* '230 User **** logged in'
08:30:17 *cmd* 'CWD /tmp'
08:30:17 *put* 'CWD /tmp\r\n'
08:30:17 *get* '250 CWD command successful\n'
08:30:17 *resp* '250 CWD command successful'
08:30:17 Total file size : 3654.1 Mb
08:30:18 *cmd* 'TYPE A'
08:30:18 *put* 'TYPE A\r\n'
08:30:18 *get* '200 Type set to A\n'
08:30:18 *resp* '200 Type set to A'
08:30:18 *cmd* 'PASV'
08:30:18 *put* 'PASV\r\n'
08:30:18 *get* '227 Entering Passive Mode (216,250,120,114,203,106).\n'
08:30:18 *resp* '227 Entering Passive Mode (216,250,120,114,203,106).'
08:30:18 *cmd* 'NLST'
08:30:18 *put* 'NLST\r\n'
08:30:18 *get* '150 Opening ASCII mode data connection for file list\n'
08:30:18 *resp* '150 Opening ASCII mode data connection for file list'
08:30:18 *get* '226 Transfer complete\n'
08:30:18 *resp* '226 Transfer complete'
08:30:18 ['.', '..', 'yay']
08:33:06 *cmd* 'TYPE I'
08:33:06 *put* 'TYPE I\r\n'
08:33:06 *get* '200 Type set to I\n'
08:33:06 *resp* '200 Type set to I'
08:33:06 *cmd* 'PASV'
08:33:06 *put* 'PASV\r\n'
08:33:06 *get* '227 Entering Passive Mode (216,250,120,114,192,207).\n'
08:33:06 *resp* '227 Entering Passive Mode (216,250,120,114,192,207).'
08:33:06 *cmd* 'STOR 2018-03-27_2018-03-13-zynthianos-stretch-lite-0.1.zip'
08:33:06 *put* 'STOR 2018-03-27_2018-03-13-zynthianos-stretch-lite-0.1.zip\r\n'
08:33:06 *get* '150 Opening BINARY mode data connection for 2018-03-27_2018-03-13-zynthianos-stretch-lite-0.1.zip\n'
08:33:06 *resp* '150 Opening BINARY mode data connection for 2018-03-27_2018-03-13-zynthianos-stretch-lite-0.1.zip'
08:33:06 1% complete remaining: 3812426128
08:38:20 2% complete remaining: 3774110096
08:43:59 3% complete remaining: 3735794064
08:49:26 4% complete remaining: 3697478032
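
For what it is worth, the QUIT itself appears to come from the `with ftplib.FTP(Server) as ftp:` block rather than from the server: when the BrokenPipeError propagates out of the block, ftplib's context manager sends QUIT over the (still working) control connection before closing it. Roughly, paraphrasing CPython's ftplib (not my code):

# roughly what FTP.__exit__ does in CPython 3.x (paraphrased)
def __exit__(self, *args):
    if self.sock is not None:
        try:
            self.quit()       # -> the '*cmd* QUIT' / '221 Goodbye.' seen in the log above
        except (OSError, EOFError):
            pass
        finally:
            if self.sock is not None:
                self.close()

So the remaining question is why the data connection dies after roughly 15 minutes of no progress, and why the partial file then disappears.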

Is there a way to prevent this from happening?

Thanks

GuySoft
  • If the partial file isn't there when you `NLST` on the server (assuming you're in the right directory, looking for the right name, etc.), there's nothing you can do from your Python code to fix it. – abarnert Mar 28 '18 at 06:13
  • There is one potential workaround that might work, though: instead of creating a 3GB zipfile, create a 150-part "multi-disk" zip archive, so the parts are only 20MB. Or, if you can't use a multi-disk zip on the other side, but you do have the ability to run some kind of scripts, just split the zipfile into 20MB chunks and cat the files together on the server. – abarnert Mar 28 '18 at 06:15
  • Do you know what program/version and OS the flaky server is using? If it's something old and crufty, it may be disconnecting because of an ancient bug that can be worked around by sending a NOOP on the control channel for every N minutes or X bytes transferred on the data channel. – abarnert Mar 28 '18 at 06:28
  • I have no access to the flaky server. It was donated to host CustomPiOS images. I did think of something I could try: upload and resume in 100 MB chunks, so that I have a "checkpoint" each time I attempt the upload (rough sketch below the comments). But I am not sure whether the file would still get removed. – GuySoft Mar 28 '18 at 11:38
  • Definitely try that. I'm not sure I'd trust that an FTP server that loses your partial files on unexpected disconnects (that it apparently causes) won't lose your files—or, worse, keep the files but lose the last block and give you 1MB of 0's every 100MB—when you do this, but hopefully you can test that and see if it works. (I'm assuming you need to upload multiple files, not just get it to work once and you're done?) – abarnert Mar 28 '18 at 15:40
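
A rough sketch of that 100 MB "checkpoint" idea from the comments, assuming the server keeps whatever was uploaded as long as the data connection is closed cleanly (untested against the flaky server; CHUNK_LIMIT and upload_chunk are my own names, not part of the script above). Instead of storbinary, it drives the data socket through FTP.transfercmd() and stops after a fixed number of bytes, so the next attempt can resume with REST as before:

import ftplib
import os

CHUNK_LIMIT = 100 * 1024 * 1024  # checkpoint every ~100 MB (arbitrary)

def upload_chunk(ftp, local_path, remote_name):
    """Send at most CHUNK_LIMIT bytes starting at whatever the server already
    has, then close the data connection cleanly. Returns True once the whole
    file is on the server."""
    total = os.path.getsize(local_path)
    ftp.voidcmd('TYPE I')
    rest_pos = ftp.size(remote_name) if remote_name in ftp.nlst() else 0
    if rest_pos >= total:
        return True
    with open(local_path, 'rb') as f:
        f.seek(rest_pos)
        # transfercmd() hands back the raw data socket so we can stop early
        conn = ftp.transfercmd('STOR ' + remote_name, rest=rest_pos or None)
        sent = 0
        while sent < CHUNK_LIMIT:
            block = f.read(1024)
            if not block:
                break
            conn.sendall(block)
            sent += len(block)
        conn.close()      # clean EOF on the data connection
        ftp.voidresp()    # read the final 226 reply for the STOR
    return rest_pos + sent >= total

The outer retry loop would then call upload_chunk() repeatedly, reconnecting on errors, until it returns True, and only afterwards do the RNFR/RNTO rename.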
