I have two lists of files that I'm pulling from an FTP folder using:
sFiles = ftp.nlst(date+'sales.csv')
oFiles = ftp.nlst(date+'orders.csv')
This results in two lists that look something like:
sFiles = ['20170822_sales.csv','20170824_sales.csv','20170825_sales.csv','20170826_sales.csv','20170827_sales.csv','20170828_sales.csv']
oFiles = ['20170822_orders.csv','20170823_orders.csv','20170824_orders.csv','20170825_orders.csv','20170826_orders.csv','20170827_orders.csv']
With my real data set, something like...
for sales, orders in zip(sorted(sFiles),sorted(oFiles)):
    df = pd.concat(...)
This gets my desired result, but there will be times when something goes wrong and one of the two files does not make it into the proper FTP folder. I'd like some code that creates an iterable from which I can extract the matching orders and sales file names based on date.
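To illustrate the problem with the sample lists above: zip pairs items by position, not by date, so a single missing file shifts every later pair. A small sketch (the `s[:8]` slice assumes each name starts with an 8-digit date, as in the samples):

```python
sFiles = ['20170822_sales.csv', '20170824_sales.csv', '20170825_sales.csv',
          '20170826_sales.csv', '20170827_sales.csv', '20170828_sales.csv']
oFiles = ['20170822_orders.csv', '20170823_orders.csv', '20170824_orders.csv',
          '20170825_orders.csv', '20170826_orders.csv', '20170827_orders.csv']

# Pair the sorted lists by position, exactly as the loop above does.
pairs = list(zip(sorted(sFiles), sorted(oFiles)))

# Keep only the pairs whose date prefixes disagree.
# Assumption: the first 8 characters of each name are the date.
mismatched = [(s, o) for s, o in pairs if s[:8] != o[:8]]

for s, o in mismatched:
    print(s, o)
```

Only the first pair lines up here; every pair after the first gap combines files from different dates.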
The following works, but I'm not sure what "pythonic" score I'd give it. Readability is poor, but it is a comprehension, so I'd imagine there are performance gains?
[(sales, orders) for sales in sFiles for orders in oFiles if re.search(r'\d+',sales).group(0) == re.search(r'\d+',orders).group(0)]
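For comparison, the comprehension above runs the regex for every sales/orders combination, so the work grows with the product of the two list lengths. One alternative might be to index the orders files by date in a dictionary and look each sales file up in it. This is only a minimal sketch: `date_key` and `match_by_date` are hypothetical helper names, and it assumes the first run of digits in each name is the date and that each date appears at most once per list.

```python
import re

def date_key(name):
    # Assumption: the first run of digits in the file name is the date.
    return re.search(r'\d+', name).group(0)

def match_by_date(sales_files, order_files):
    """Return (sales, orders) pairs whose date keys agree, in sorted sales order."""
    # Index orders files by date; a duplicate date would keep the last name seen.
    orders_by_date = {date_key(name): name for name in order_files}
    matched = []
    for sales in sorted(sales_files):
        orders = orders_by_date.get(date_key(sales))
        if orders is not None:  # skip sales files with no orders file for that date
            matched.append((sales, orders))
    return matched

sFiles = ['20170822_sales.csv', '20170824_sales.csv', '20170825_sales.csv',
          '20170826_sales.csv', '20170827_sales.csv', '20170828_sales.csv']
oFiles = ['20170822_orders.csv', '20170823_orders.csv', '20170824_orders.csv',
          '20170825_orders.csv', '20170826_orders.csv', '20170827_orders.csv']

for sales, orders in match_by_date(sFiles, oFiles):
    print(sales, orders)
```

With the sample lists, the files for 20170823 (orders only) and 20170828 (sales only) are skipped, and the remaining five dates are paired.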