This is a follow-up to the question "Parallel downloads with Multiprocessing and PySftp".

How can I print the files that were downloaded successfully? My goal is to append a record to a database table for each one, creating a log of downloaded files with the filename, date and time.

Any ideas? I've searched for examples and run some tests, but either my download function isn't returning anything or I'm not using the right code to read the results and print them.

DOWNLOAD function

import pysftp
import os

def fdownload(vfileaux):

    vtmpspl = vfileaux.split(',')

    vfile = vtmpspl[0]
    vhost = vtmpspl[1]
    vlogin = vtmpspl[2]
    vpwd = vtmpspl[3]
    vftppath = vtmpspl[4]
    vlocalpath = vtmpspl[5]

    os.chdir(vlocalpath)
    os.getcwd()

    cnopts = pysftp.CnOpts()
    cnopts.hostkeys = None

    vfilecheck = vlocalpath + '/' + vfile

    if not os.path.isfile(vfilecheck):

        vftpaux = pysftp.Connection(host=vhost, username=vlogin, password=vpwd, cnopts=cnopts)
        vftpaux.cwd(vftppath)
        vftpaux.get(vfile, preserve_mtime=True)
        vftpaux.close()

        return vnename + "_" + vdatetime

    else:
        pass
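One thing worth noting about the function as posted: `vnename` and `vdatetime` are never assigned inside `fdownload`, so the `return` line would raise a `NameError` once a download completes, and because the results of `map` are never read in MAIN, that error would go unnoticed. Below is a minimal sketch of one way the function could report what it downloaded. It assumes the filename itself is the value to log, and `fetch` is a hypothetical callable standing in for the pysftp connect/cwd/get/close sequence so the sketch runs without a server.

```python
import datetime
import os


def fdownload_sketch(vfile, vlocalpath, fetch):
    """Download vfile unless it already exists in vlocalpath.

    `fetch` is a hypothetical stand-in for the pysftp calls in the
    original function; it is not part of pysftp.
    Returns (filename, timestamp) for a new download, None if skipped.
    """
    vfilecheck = os.path.join(vlocalpath, vfile)

    if os.path.isfile(vfilecheck):
        return None  # file already present, nothing to log

    fetch(vfile, vfilecheck)

    # Record the time the download finished, formatted for a DATETIME column
    vdatetime = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    return vfile, vdatetime
```

Returning `None` explicitly for skipped files makes it easy for the caller to filter them out before writing the log.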

MAIN function

import datetime
from ffilelist import *
from ffilefilter import *
from developing.fdownload import *
import pymysql.cursors
import concurrent.futures

def main():

    print(datetime.datetime.now(), 'Loading variables...')

    vhostlist = {}
    vloginlist = {}
    vpwdlist = {}
    vftppathlist = {}
    vlocalpathlist = {}

    vhostaux = '10.11.12.13'
    vhostlist[vhostaux] = vhostaux
    vloginlist[vhostaux] = 'admin'
    vpwdlist[vhostaux] = 'pass1234'
    vftppathlist[vhostaux] = '/export/home'
    vlocalpathlist[vhostaux] = 'd:/test/'

    vfilelist1 = []

    global vfilelist2
    vfilelist2 = []

    for vhosttmp in vhostlist:

        print(datetime.datetime.now(), 'Starting to process ' + vhosttmp + "...")

        global vhost
        global vlogin
        global vpwd
        global vftppath
        global vlocalpath

        vhost = vhostlist[vhosttmp]
        vlogin = vloginlist[vhosttmp]
        vpwd = vpwdlist[vhosttmp]
        vftppath = vftppathlist[vhosttmp]
        vlocalpath = vlocalpathlist[vhosttmp]

        vfilelist1 = ffilelist(vhost, vlogin, vpwd, vftppath)

        print(datetime.datetime.now(), 'Vectorizing download file list...')

        for vfile in vfilelist1:
            vfilelist2.append(vfile + ',' + vhost + ',' + vlogin + ',' + vpwd + ',' + vftppath + ',' + vlocalpath)

    vfilelist0 = ffilefilter(vfilelist2)

    print(datetime.datetime.now(), 'Starting simultaneous downloads...')

    vpool = concurrent.futures.ProcessPoolExecutor(max_workers=8)
    vpool.map(fdownload, vfilelist0)
    vpool.shutdown()

    print(datetime.datetime.now(), 'Downloads finished!')

The INSERT statement for storing the log in MariaDB looks something like this. It is already tested and working, and will go into the MAIN function once I find a way to get the list of downloaded files.

vconn = pymysql.connect(host='localhost', user='root', password='pass1234', db='test')
vcurs = vconn.cursor()
vsql = "INSERT INTO `logs_download` (`ne`, `datetime`) VALUES (%s, %s)"
# Pass the values separately so pymysql escapes them (avoids SQL injection)
vcurs.execute(vsql, (vnename, vdatetime))
vconn.commit()
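Once the list of downloaded files is available, all log rows can be written in one call. The sketch below is only an illustration: it uses the standard-library `sqlite3` module with an in-memory database as a stand-in for MariaDB so that it runs without a server, and `log_downloads` and the sample rows are made up for the example. With pymysql the placeholders would be `%s` instead of `?`.

```python
import sqlite3


def log_downloads(vconn, vrows):
    """Insert one (ne, datetime) row per downloaded file."""
    vsql = "INSERT INTO logs_download (ne, datetime) VALUES (?, ?)"
    # Placeholders let the driver handle quoting, even for names with quotes
    vconn.executemany(vsql, vrows)
    vconn.commit()


# In-memory stand-in for the real logs_download table (assumed columns)
vconn = sqlite3.connect(':memory:')
vconn.execute("CREATE TABLE logs_download (ne TEXT, datetime TEXT)")

# Made-up sample rows in the shape (filename, timestamp)
vrows = [
    ('file_a.csv', '2018-05-08 12:45:31'),
    ("o'brien.csv", '2018-05-08 12:45:32'),
]
log_downloads(vconn, vrows)

vcount = vconn.execute("SELECT COUNT(*) FROM logs_download").fetchone()[0]
print(vcount)
```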
Thiago Matsui
  • If you're trying to retrieve the values from `return vnename + "_" + vdatetime` in `fdownload`, they'll be in the result of `vpool.map(fdownload, vfilelist0)` which returns a list. Try `print(vpool.map(fdownload, vfilelist0))`. – Alex Hall May 08 '18 at 15:38

1 Answer

I tried what Alex suggested and changed part of the code:

vpool = concurrent.futures.ThreadPoolExecutor(max_workers=8) 
print(vpool.map(fdownload, vfilelist0)) 
vpool.shutdown()

...but got the following results:

2018-05-08 12:44:25.115066 Loading variables...
2018-05-08 12:44:25.115066 Starting to process 10.11.12.13...
2018-05-08 12:44:25.115066 Disabling known hosts...
2018-05-08 12:44:25.115066 Opening FTP connection...
2018-05-08 12:44:26.567149 Reading objects list...
2018-05-08 12:45:30.580015 Separating files from folders...
2018-05-08 12:45:30.584015 Closing FTP connection...
2018-05-08 12:45:30.585015 Vectorizing download file list...
2018-05-08 12:45:30.596016 Filtering latest file for each object...
2018-05-08 12:45:30.648019 Vectorizing only latest files...
2018-05-08 12:45:30.652019 Starting simultaneous downloads...
<generator object Executor.map.<locals>.result_iterator at 0x04332D20>
Thiago Matsui
  • Sorry, I assumed it was a list. You can convert it easily: `print(list(vpool.map(fdownload, vfilelist0)))` – Alex Hall May 08 '18 at 16:05
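To illustrate the point in the comments: `Executor.map` returns a lazy iterator, so the return values only become available once it is consumed, for example with `list()`. The sketch below shows that, plus an alternative using `submit` and `as_completed`, which hands back each result as soon as its download finishes. `fake_download` is a hypothetical stand-in for `fdownload` that returns the filename, or `None` for a skipped file, and the input strings are made up.

```python
import concurrent.futures


def fake_download(vfileaux):
    # Hypothetical stand-in for fdownload: no network access
    vfile = vfileaux.split(',')[0]
    if vfile.startswith('skip'):
        return None  # mimics a file that already exists locally
    return vfile


vfilelist0 = ['a.csv,host', 'skip_b.csv,host', 'c.csv,host']

with concurrent.futures.ThreadPoolExecutor(max_workers=8) as vpool:
    # Option 1: map() keeps input order; list() forces the lazy iterator
    vresults = list(vpool.map(fake_download, vfilelist0))
    vdownloaded = [r for r in vresults if r is not None]

    # Option 2: submit() + as_completed() yields futures as they finish
    vfutures = [vpool.submit(fake_download, f) for f in vfilelist0]
    vfinished = []
    for vfut in concurrent.futures.as_completed(vfutures):
        vres = vfut.result()  # re-raises any exception from the worker
        if vres is not None:
            vfinished.append(vres)

print(vdownloaded)
```

Either list can then be passed to the database logging step. Reading the results also surfaces any exception raised inside a worker, which would otherwise stay hidden.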