
I'm learning Python and trying to write a script that syncs two directories: one is on an FTP server, the other is on my local disk. I have working code so far, but I have a question or two about it :)

import os
from ftplib import FTP

h_local_files = [] # create local dir list
h_remote_files = [] # create remote dir list

h_local = 'C:\\something\\bla\\' # local dir

ftp = FTP('ftp.server.com')
ftp.login('user', 'pass')

if os.listdir(h_local) == []:
    print 'LOCAL DIR IS EMPTY'
else:
    print 'BUILDING LOCAL DIR FILE LIST...'
    for file_name in os.listdir(h_local):
        h_local_files.append(file_name) # populate local dir list

ftp.sendcmd('CWD /some/ftp/directory')
print 'BUILDING REMOTE DIR FILE LIST...\n'
for rfile in ftp.nlst():
    if rfile.endswith('.jpg'): # i need only .jpg files
        h_remote_files.append(rfile) # populate remote dir list

h_diff = sorted(list(set(h_remote_files) - set(h_local_files))) # difference between two lists

for h in h_diff:
    with open(os.path.join(h_local,h), 'wb') as ftpfile:
        s = ftp.retrbinary('RETR ' + h, ftpfile.write) # retrieve file
        print 'PROCESSING', h
        if str(s).startswith('226'): # comes from ftp status: '226 Transfer complete.'
            print 'OK\n' # print 'OK' if transfer was successful
        else:
            print s # if error, print retrbinary's return

This code should build two Python lists: the files in the local directory and the files in the FTP directory. After taking the difference between the two lists, the script should download the 'missing' files to my local directory.

For now the code does what I need, but I noticed that when I run it, the output doesn't appear in the order I expected :)

For example, my current output goes:

PROCESSING 2012-01-17_07.05.jpg
OK

# LONG PAUSE HERE

PROCESSING 2012-01-17_07.06.jpg
OK

# LONG PAUSE HERE

PROCESSING 2012-01-17_07.06.jpg
OK

etc...

but I imagine that it should work like this:

PROCESSING 2012-01-17_07.05.jpg
# LONG PAUSE HERE (WHILE DOWNLOADING)
OK

PROCESSING 2012-01-17_07.06.jpg
# LONG PAUSE HERE (WHILE DOWNLOADING)
OK

PROCESSING 2012-01-17_07.06.jpg
# LONG PAUSE HERE (WHILE DOWNLOADING)
OK

etc...

As I said, I just started learning Python, and maybe I'm doing some things here completely wrong (if str(s).startswith('226')????). Maybe I cannot achieve this with ftplib alone? So in the end my questions are:

What am I doing wrong here? :)
How do I produce the 'proper' output, and is there a way to print some kind of status while a file is downloading (at least a line of dots)? For example:

PROCESSING 2012-01-17_07.05.jpg
..........
OK

PROCESSING 2012-01-17_07.06.jpg
......
OK

PROCESSING 2012-01-17_07.06.jpg
...............
OK

etc...

Thanks a lot for helping!

errata

1 Answer


retrbinary blocks until the transfer is complete. Because your print 'PROCESSING' line comes after the call to retrbinary, it only runs once the download has finished, so PROCESSING and OK appear together and the pause falls between files.

If you want to print a . for each block received, you need to provide a callback function that does it. Here is the docstring for retrbinary:
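To get the pause between PROCESSING and OK, the message has to be printed (and flushed) before the blocking call. A minimal sketch, assuming `ftp` is an already connected `ftplib.FTP` object like the one in the question; `fetch_with_status` and its `out` parameter are hypothetical names used only for illustration:

```python
import os
import sys


def fetch_with_status(ftp, name, dest_dir, out=sys.stdout):
    """Announce the file, download it, then report the server's reply.

    `ftp` is assumed to be a connected ftplib.FTP instance (or any object
    with a compatible retrbinary method).
    """
    # Print and flush *before* the blocking call so the message is visible
    # while the transfer is still running.
    out.write('PROCESSING %s\n' % name)
    out.flush()
    with open(os.path.join(dest_dir, name), 'wb') as local_file:
        # retrbinary returns only after the whole file has arrived.
        reply = ftp.retrbinary('RETR ' + name, local_file.write)
    out.write('OK\n' if reply.startswith('226') else reply + '\n')
    return reply
```

In the question's loop this would be called as `fetch_with_status(ftp, h, h_local)` for each `h` in `h_diff`.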

    """Retrieve data in binary mode.  A new port is created for you.

    Args:
      cmd: A RETR command.
      callback: A single parameter callable to be called on each
                block of data read.
      blocksize: The maximum number of bytes to read from the
                 socket at one time.  [default: 8192]
      rest: Passed to transfercmd().  [default: None]

    Returns:
      The response code.
    """

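So you need to provide a different callback, one that both writes to the file and prints a '.':

import sys # At the top of your module.

# Modify your retrbinary call. file.write() returns None in Python 2, so
# chaining with "and" would never reach the second call; a tuple runs both.
ftp.retrbinary('RETR ' + h, lambda s: (ftpfile.write(s), sys.stdout.write('.')))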

You may have to adapt that snippet to your code, but it should give you an idea of what to do.
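If the lambda gets hard to read, a named callback does the same job. This is only a sketch: `make_progress_writer`, `out` and `mark` are names made up for the example, and the explicit flush is there because stdout may otherwise hold the dots back until the buffer fills.

```python
import sys


def make_progress_writer(fileobj, out=sys.stdout, mark='.'):
    """Build a retrbinary callback that saves each block and prints a mark."""
    def on_block(data):
        fileobj.write(data)   # save the block to the local file
        out.write(mark)       # one mark per block received
        out.flush()           # show it now instead of when the buffer fills
    return on_block
```

Inside the question's `with` block it would be used as `ftp.retrbinary('RETR ' + h, make_progress_writer(ftpfile))`, followed by printing a newline before OK. The callback runs once per block read, so with the default blocksize of 8192 a larger file produces a longer line of dots.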

jaime