I have a threaded FTP script. While the data socket is receiving the data, a threaded loop sends NOOP commands to the control socket to keep the control connection alive during large transfers.

I can't use FTP.retrbinary(): to keep the control connection alive I need separate access to the data and control sockets, and retrbinary() does not expose the data socket.

Code below:

import threading
from ftplib import FTP

def downloadFile(filename, folder):
    myhost = 'HOST'
    myuser = 'USER'
    passw = 'PASS'
    # log in
    ftp = FTP(myhost, myuser, passw)

    ftp.set_debuglevel(2)
    ftp.voidcmd('TYPE I')  # force binary mode
    sock = ftp.transfercmd('RETR ' + filename)

    def background():
        # receive the file on the data socket
        with open(folder + filename, 'wb') as f:
            while True:
                block = sock.recv(1024*1024)
                if not block:
                    break
                f.write(block)
        sock.close()

    t = threading.Thread(target=background)
    t.start()
    while t.is_alive():
        t.join(120)
        ftp.voidcmd('NOOP')  # keep-alive on the control socket
    ftp.quit()


My PROBLEM: FTP.transfercmd("RETR " + filename) defaults to ASCII transfers and I'm transferring video, so it has to be binary (hence the ftp.voidcmd('TYPE I') call to force binary mode).

If I DON'T call ftp.voidcmd('TYPE I'), the NOOP commands get sent successfully and the output is as follows:

*cmd* 'NOOP'
*put* 'NOOP\r\n'
*get* '200 NOOP: data transfer in progress\n'
*resp* '200 NOOP: data transfer in progress'
*cmd* 'NOOP'
*put* 'NOOP\r\n'
*get* '200 NOOP: data transfer in progress\n'
*resp* '200 NOOP: data transfer in progress'
*cmd* 'NOOP'
*put* 'NOOP\r\n'
*get* '200 NOOP: data transfer in progress\n'
*resp* '200 NOOP: data transfer in progress'

etc. But the file is transferred in ASCII mode and is therefore corrupted. If I DO call ftp.voidcmd('TYPE I'), the NOOP command is only sent once, and the control socket doesn't respond until the transfer completes. If the file is large, the control socket times out as if the NOOPs were never sent...

Very strange, but I am sure it's simple. It seems as though transfercmd() is not splitting the control and data sockets as it is supposed to, and therefore the ftp variable is not separated from the data stream, or something like that.

Thanks in advance for any advice you can offer.

hammus
  • Instead of using both `FTP.voidcmd('TYPE I')` and `FTP.transfercmd()`, have you tried to use `FTP.retrbinary()` instead? – uselpa Nov 15 '13 at 22:34
  • @uselpa - Thanks for the response. Yes retrbinary is not an option as it does not return the data socket, which is necessary to send separate commands to the control socket. – hammus Nov 18 '13 at 21:48
  • Please try `tcpdump` and/or `strace` to narrow down the problem. I'm a bit surprised to see "data transfer in progress" in response to `NOOP`; perhaps that's an `ftplib` artefact. – Dima Tisnek Nov 21 '13 at 16:35
  • On an unrelated note, why do you need to keep the control connection around anyway? Is this the smallest test case, or is this your entire code? If it's a simple download, there are tons of other libraries, e.g. `pycurl`. – Dima Tisnek Nov 21 '13 at 16:37
  • duplicate of http://stackoverflow.com/questions/5545666/threaded-noop-command-during-retrbinary – Dima Tisnek Nov 21 '13 at 16:39
  • @qarma - re: keeping the control connection around: I'm doing it so that I can tell when the transfer is finished; otherwise ftplib will wait forever for a transfer-complete response. I will look into the pseudocode you have suggested at the bottom of your answer. – hammus Nov 21 '13 at 20:12

1 Answer

tcpdump confirms that the server only sends `226 Transfer complete.` after it has sent the entire file.

I suspect that's part of the FTP specification.

In fact, look at the retrbinary code in ftplib.py:

    self.voidcmd('TYPE I')
    conn = self.transfercmd(cmd, rest)
    while 1:
        data = conn.recv(blocksize)
        if not data:
            break
        callback(data)
    conn.close()
    return self.voidresp()

The last line expects to get the transfer result (as known to the server) only after the transfer is complete.

In fact, it seems your code is missing the voidresp() call.
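
As a minimal sketch of that point, the function below receives the file on the data socket and then reads the end-of-transfer reply with voidresp(), mirroring what retrbinary does. The name `download_file` and its parameters are made up for illustration; `ftp` is assumed to be an already logged-in `ftplib.FTP` object, and the NOOP keep-alive thread from the question is left out.

```python
def download_file(ftp, filename, local_path, blocksize=8192):
    """Download filename in binary mode and return the server's final reply.

    ftp is assumed to be a connected, logged-in ftplib.FTP instance.
    """
    ftp.voidcmd('TYPE I')                       # switch to binary mode
    conn = ftp.transfercmd('RETR ' + filename)  # returns the data socket
    try:
        with open(local_path, 'wb') as f:
            while True:
                block = conn.recv(blocksize)
                if not block:                   # empty read: server closed the data socket
                    break
                f.write(block)
    finally:
        conn.close()
    # Read the end-of-transfer reply (normally 226) from the control socket.
    # voidresp() raises an ftplib error if the reply is not a 2xx code.
    return ftp.voidresp()
```

With the placeholder credentials from the question, it could be called as `download_file(FTP('HOST', 'USER', 'PASS'), 'video.bin', '/tmp/video.bin')` after `from ftplib import FTP`; the file names here are only examples.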

I am not very familiar with FTP; from what I've seen, background downloaders like lftp actually open a new control connection for each parallel download.

You have a valid concern if your file is really large.

There are many extensions to FTP; there may be something that does what you want.

Alternatively, you can write a loop like this:

pos = 0
while not full file:
    command REST pos, then RETR
    download for a while in separate thread
    command ABOR
    wait for separate thread to abort
    pos += length of downloaded chunk
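
A rough Python sketch of a variant of this loop follows. It is not the exact procedure above: instead of a timer thread and an ABOR command, it reads a fixed number of bytes per chunk and then simply closes the data socket. It assumes the server supports REST and answers an early close with either a 2xx reply or a 4xx reply such as 426; servers differ here, so treat that as an assumption to verify. The name `download_in_chunks` and its parameters are hypothetical.

```python
from ftplib import error_temp

def download_in_chunks(ftp, filename, local_path,
                       chunk_bytes=64 * 1024 * 1024, blocksize=8192):
    """Fetch filename with one REST + RETR per chunk; return bytes written.

    ftp is assumed to be a connected, logged-in ftplib.FTP instance.
    """
    ftp.voidcmd('TYPE I')
    pos = 0
    with open(local_path, 'wb') as f:
        while True:
            # rest=pos makes ftplib send "REST <pos>" before the RETR.
            conn = ftp.transfercmd('RETR ' + filename, rest=pos)
            received = 0
            while received < chunk_bytes:
                block = conn.recv(min(blocksize, chunk_bytes - received))
                if not block:
                    break  # server closed the data socket: end of file
                f.write(block)
                received += len(block)
            conn.close()
            try:
                # Consume the reply for this RETR so the control
                # connection is in a clean state for the next command.
                ftp.voidresp()
            except error_temp:
                # Assumption: a 4xx reply here only reports our early close.
                pass
            pos += received
            if received < chunk_bytes:
                break  # short chunk means the file is complete
    return pos
```

Because every chunk ends with a command and a reply on the control connection, that connection sees traffic at regular intervals without any NOOPs. The trade-off is one extra RETR round trip per chunk.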
Dima Tisnek
  • Thanks, bounty awarded, I will persist with your pseudo code solution. Thanks for your time. – hammus Nov 21 '13 at 23:14
  • I believe the problem is that after some time the control channel is closed, and the `226 Transfer complete` is not sent... – DejanLekic Jan 10 '19 at 16:29