
I'm trying to write this function so that I can pass files or folders and read from them using pandas.

import pandas as pd
import os

path = os.getcwd()
path = '..' #this would be root

revenue_folder = '../Data/Revenue'
random_file = '2017-08-01_Aug.csv'

def csv_reader(csv_file):
for root, dirs, files in os.walk(path):
    for f in files:
        with open(os.path.join(root, csv_file)) as f1:
            pd.read_csv(f1, sep = ';')
            print(f1)

csv_reader(random_file)

FileNotFoundError: [Errno 2] No such file or directory: '../2017-08-01_Aug.csv'

I have since made some changes, and now the problem is that it looks in a different subdirectory. What I want is to iterate through all my files and folders, find the desired file, and then read it. To be clear, my desired file is in revenue_folder.

def csv_reader(csv_file):
for root, dirs, files in os.walk(path):
    for f in files:
        base, ext = os.path.splitext(f)
        if ('csv' in ext):
            print (root)
            with open(os.path.join(root, csv_file)) as f1:
                pd.read_excel(f1, sep = ':')
                print(f1)

csv_reader(random_file)

FileNotFoundError: [Errno 2] No such file or directory: './Data/Backlog/2017-08-01_Aug.csv'
Krownose
  • @iam.Carrot This I understand, but why doesn't it iterate through all my files and find that specific file? That is what I'm trying to do. – Krownose Feb 28 '18 at 17:10
  • As far as I can see, the indentation of the for loop does not seem to match correctly here. – garmoncheg Feb 28 '18 at 17:17
  • @iam.Carrot yes that is correct. – Krownose Feb 28 '18 at 19:13
  • @iam.Carrot yes it did! thank you so much. just a quick question, before you updated the answer you said something that was very useful, about putting the open argument into a variable. could you please repost it? – Krownose Mar 01 '18 at 19:48
  • @iam.Carrot awesome, everything working now. – Krownose Mar 01 '18 at 20:40

1 Answer


After the edit, the whole scenario of the question changed. The code below searches recursively through the files and folders to find the files that match the criteria:

import os

import pandas as pd


def get_all_matching_files(root_path, matching_criteria):
    """
    Gets all files whose name ends with the given criteria.
    :param root_path: the root directory path from where searching needs to begin
    :param matching_criteria: a string or a tuple of strings that needs to be matched at the end of the file name
    :return: a list of all matching files
    """
    return [os.path.join(root, name)
            for root, dirs, files in os.walk(root_path)
            for name in files
            if name.endswith(matching_criteria)]


def main(root_path):
    """
    The main method to start finding the file.
    :param root_path: The root dir where the search needs to be started.
    :return: None
    """
    if len(root_path) < 2:
        raise ValueError('The root path must be at least 2 characters long')

    all_matching_files = get_all_matching_files(root_path, '2017-08-01_Aug.csv')
    if not all_matching_files:
        print('no files were found matching that criteria.')
        return

    for matched_file in all_matching_files:
        # the files in the question are ';'-separated
        data_frame = pd.read_csv(matched_file, sep=';')
        # your code here on what to do with the dataframe


    print('Completed search!')


if __name__ == '__main__':
    root_dir_path = os.getcwd()
    main(root_dir_path)

Note the endswith() I've used to match the files: it gives you the flexibility to pass in a file extension (.csv) and get all files with that extension. endswith() also accepts a tuple, so you can pass a tuple of file names or extensions and the method will still work.
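To illustrate the string-versus-tuple behaviour, here is a minimal sketch. It repeats the function from this answer and runs it against a throwaway directory tree; the file names a.csv, b.txt and c.xlsx and the temporary folder are assumptions made up for the demonstration, not part of the question's data. Note that str.endswith() accepts a tuple but not a list.

```python
import os
import tempfile


def get_all_matching_files(root_path, matching_criteria):
    """Return paths under root_path whose file name ends with matching_criteria."""
    return [
        os.path.join(root, name)
        for root, dirs, files in os.walk(root_path)
        for name in files
        if name.endswith(matching_criteria)
    ]


# Build a throwaway directory tree (hypothetical file names, for illustration only).
demo_root = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_root, "Data", "Revenue"))
for file_name in ("a.csv", "b.txt", "c.xlsx"):
    with open(os.path.join(demo_root, "Data", "Revenue", file_name), "w") as handle:
        handle.write("x;y\n1;2\n")

# A single string matches one suffix; a tuple matches any of its suffixes.
csv_only = get_all_matching_files(demo_root, ".csv")
csv_or_excel = get_all_matching_files(demo_root, (".csv", ".xlsx"))
matched_names = sorted(os.path.basename(p) for p in csv_or_excel)
print(matched_names)
```

Passing a full file name such as '2017-08-01_Aug.csv' works the same way, because a complete name is simply a long suffix.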

Other Suggestions:

When reading a file with pandas, you don't need this code:

with open(os.path.join(root, csv_file)) as f1:
    pd.read_csv(f1, sep = ';')
    print(f1)

Instead, pass the file path directly to read_csv:

# set the file path into a variable to make code readable
filepath = os.path.join(revenue_folder, random_file)
# read the data and store it into a variable of type DataFrame
my_dataframe_from_file = pd.read_csv(filepath,sep=';')
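As for why the loops in the question raise FileNotFoundError: they join csv_file onto every directory that os.walk visits, so open() is attempted in folders that do not contain the file. A minimal sketch of one way around this is to build the path only when the file name appears in the current directory's file list. The helper name find_file and the temporary directory layout are hypothetical, created here only to mirror the question's Data/Revenue and Data/Backlog folders.

```python
import os
import tempfile


def find_file(root_path, file_name):
    """Return the full path of the first file called file_name under root_path, or None."""
    for root, dirs, files in os.walk(root_path):
        # Only build the path in the directory that actually contains the file.
        if file_name in files:
            return os.path.join(root, file_name)
    return None


# Throwaway tree mirroring the question's layout (created here only for the demo).
demo_root = tempfile.mkdtemp()
for folder in ("Backlog", "Revenue"):
    os.makedirs(os.path.join(demo_root, "Data", folder))
target = os.path.join(demo_root, "Data", "Revenue", "2017-08-01_Aug.csv")
with open(os.path.join(demo_root, "Data", "Backlog", "other.csv"), "w") as handle:
    handle.write("a;b\n")
with open(target, "w") as handle:
    handle.write("a;b\n1;2\n")

found_path = find_file(demo_root, "2017-08-01_Aug.csv")
missing_path = find_file(demo_root, "does_not_exist.csv")
print(found_path)
```

If a path is returned, it can be handed to pd.read_csv(found_path, sep=';') exactly as in the snippet above; a None result means the file was not found anywhere under the root.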
iam.Carrot