I am trying to store the path to a folder called TimeSeries, located in the directory I am working in, in a variable, and then read each file in that folder. Apparently, the error arises because df = pd.read_csv(f) receives a relative path instead of an absolute one. However, I can't confirm this: when I check isabs(direct), I get back True. I know the error comes from that specific line; I just don't know what causes it.

Code:

import pandas as pd
import numpy as np
import os

direct = os.path.abspath('TimeSeries')


for f in direct:
    df = pd.read_csv(f)
    df = df.replace(np.nan, 'Other', regex=True)
    if df.columns[0] == ['FIPS']:
        print(df.columns)
        df = df.drop(['FIPS', 'Last_Update', 'Lat', 'Long_'], axis=1)
        df = df.rename(columns={'Admin2': 'County',
                                'Province_State': 'State',
                                'Country_Region': 'Country',
                                'Combined_Key': 'City'})
        df.to_csv(f)
    elif df.columns[0] == ['Province/State']:
        print(df.columns)
        df = df.drop(['Last Update'], axis=1)
        df = df.rename(columns={'Province/State': 'State',
                                'Country/Region': 'Country'})
        df.to_csv(f)
    else:
        pass

Result:

Traceback (most recent call last):
  File "C:/Users/USER/PycharmProjects/Corona Stats/Corona.py", line 9, in <module>
    df = pd.read_csv(f)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 676, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 448, in _read
    parser = TextFileReader(fp_or_buf, **kwds)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 880, in __init__
    self._make_engine(self.engine)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 1114, in _make_engine
    self._engine = CParserWrapper(self.f, **self.options)
  File "C:\Users\USER\AppData\Local\Programs\Python\Python38\lib\site-packages\pandas\io\parsers.py", line 1891, in __init__
    self._reader = parsers.TextReader(src, **kwds)
  File "pandas\_libs\parsers.pyx", line 374, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas\_libs\parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File C does not exist: 'C'

Process finished with exit code 1

This is what happens when I print direct.

C:\Users\USER\PycharmProjects\Corona Stats\TimeSeries
Luck Box
  • `C:\Users\USER\PycharmProjects\Corona Stats\TimeSeries` – Luck Box Apr 03 '20 at 04:39
  • You should include it in the question. – monkut Apr 03 '20 at 04:44
  • Don't iterate with `for f in direct:`; that is where the bug is. `os.path.abspath()` returns the path as a `str`, so the for loop iterates over the characters of that string. – Arpit Maiya Apr 03 '20 at 04:45
  • So, then, what do I do? I am desperate at this point. – Luck Box Apr 03 '20 at 04:46
  • If you want to iterate over files in the directory, use `files = os.listdir(direct)` then iterate over `files` variable. – Arpit Maiya Apr 03 '20 at 04:49
  • If you mean like `for item in os.listdir(direct):` that's covered in the answer below resulting in `FileNotFoundError: [Errno 2] File 01-22-2020.csv does not exist: '01-22-2020.csv'` – Luck Box Apr 03 '20 at 04:53
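
The second `FileNotFoundError` in the comment above is consistent with `os.listdir()` returning bare file names rather than full paths. The following is a minimal sketch of that behaviour, assuming a temporary folder as a hypothetical stand-in for TimeSeries (the folder and the sample file are created only for illustration):

```python
import os
import tempfile

# Hypothetical stand-in for the TimeSeries folder so the sketch runs anywhere.
direct = tempfile.mkdtemp()
with open(os.path.join(direct, '01-22-2020.csv'), 'w') as fh:
    fh.write('Province/State,Country/Region\n')

# os.listdir() returns only the names of the entries, not their full paths.
names = os.listdir(direct)

# Joining each name with the folder gives a path that can actually be opened.
full_paths = [os.path.join(direct, name) for name in names]

print(names)
print([os.path.isfile(path) for path in full_paths])
```

A bare name such as `01-22-2020.csv` is resolved against the current working directory, so it is only found when the script happens to run from inside that folder.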

3 Answers

Both plain Python and pandas' pd.read_csv resolve a relative path against the current working directory, which by default is the directory the Python process was started from. So one way to fix this is to use the os module to chdir() into the folder and read the files from there.

import pandas as pd 
import os
print(os.getcwd())
os.chdir("<PATH TO DIRECTORY>")
print(os.getcwd())
df = pd.read_csv('<The Filename You want to read>')
print(df.head())
Dwij Sheth
  • How do I put in the path to directory? `C:\Users\USER\PycharmProjects\Corona Stats\TimeSeries` is giving me invalid escape sequence. – Luck Box Apr 03 '20 at 04:43
  • You need to use a double backslash (\\) instead of a single one (\) when writing a path in a Python string. – Dwij Sheth Apr 03 '20 at 04:48
  • `os.chdir('')` results in: `OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: ''` – Luck Box Apr 03 '20 at 04:59
  • You don't need to include the < > brackets in the path; I used them only as a placeholder. – Dwij Sheth Apr 03 '20 at 05:09
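
As a side note on the escaping issue raised in these comments, here is a small sketch of three common ways to write the same Windows path in Python source. In an ordinary string literal the backslash starts an escape sequence, which is why the unescaped path is rejected. The variable names are illustrative only, and `ntpath` is used so the comparison also runs on non-Windows systems:

```python
import ntpath  # the Windows flavour of os.path, importable on any platform

# Doubled backslashes: each \\ in the source is one backslash in the string.
escaped = 'C:\\Users\\USER\\PycharmProjects\\Corona Stats\\TimeSeries'

# Raw string: backslashes are taken literally, so no doubling is needed.
raw = r'C:\Users\USER\PycharmProjects\Corona Stats\TimeSeries'

# Forward slashes: no escaping needed in the literal.
forward = 'C:/Users/USER/PycharmProjects/Corona Stats/TimeSeries'

print(escaped == raw)
print(ntpath.normpath(forward) == raw)
```

Any of the three forms can be passed to `os.chdir()` or `os.listdir()`; the raw string is often the least error-prone when copying a path from a file manager.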

If I understand correctly, try:

import os

import numpy as np
import pandas as pd

source = "C:/Users/USER/PycharmProjects/Corona Stats/TimeSeries"
for filename in os.listdir(source):
    # os.listdir() returns bare names, so join each one with the folder path.
    filepath = os.path.join(source, filename)
    if not os.path.isfile(filepath):
        continue  # skip subdirectories

    df = pd.read_csv(filepath)
    df = df.replace(np.nan, 'Other', regex=True)
    # df.columns[0] is a string, so compare it with a string, not a list.
    if df.columns[0] == 'FIPS':
        print(df.columns)
        df = df.drop(['FIPS', 'Last_Update', 'Lat', 'Long_'], axis=1)
        df = df.rename(columns={'Admin2': 'County',
                                'Province_State': 'State',
                                'Country_Region': 'Country',
                                'Combined_Key': 'City'})
        # index=False keeps the row index from being written as an extra column.
        df.to_csv(filepath, index=False)
    elif df.columns[0] == 'Province/State':
        print(df.columns)
        df = df.drop(['Last Update'], axis=1)
        df = df.rename(columns={'Province/State': 'State',
                                'Country/Region': 'Country'})
        df.to_csv(filepath, index=False)
Shubham Sharma

Here you're iterating over EACH letter in the path:

direct = 'C:/Users/USER/PycharmProjects/Corona Stats/TimeSeries'

for f in direct:
    ...
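
To see why the traceback complains about a file named `C`, it may help to look at what the loop actually yields. This is only an illustrative sketch; `first_items` is a name introduced here, not part of the original code:

```python
direct = 'C:/Users/USER/PycharmProjects/Corona Stats/TimeSeries'

# Iterating over a str yields its characters one at a time.
first_items = list(direct)[:3]

print(first_items)  # ['C', ':', '/']
```

The first value of `f` is therefore the single character `'C'`, which `pd.read_csv` treats as a relative file name.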

If you want to get the files in the directory you should use something like:

for item in os.listdir(direct):
    ...

Personally I would use pathlib:

from pathlib import Path

direct = Path('C:/Users/USER/PycharmProjects/Corona Stats/TimeSeries')

for item in direct.glob('*'):
    ...
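
A slightly fuller sketch of the pathlib approach, assuming only the CSV files in the folder are of interest. The temporary folder and the sample files are hypothetical stand-ins so that the example runs anywhere:

```python
import tempfile
from pathlib import Path

# Hypothetical stand-in for the TimeSeries folder.
direct = Path(tempfile.mkdtemp())
(direct / '01-22-2020.csv').write_text('FIPS,Admin2\n')
(direct / 'notes.txt').write_text('not a data file\n')
(direct / 'archive').mkdir()

# glob('*.csv') yields Path objects that already include the folder,
# so no separate join step is needed; is_file() skips directories.
csv_files = sorted(p for p in direct.glob('*.csv') if p.is_file())

for path in csv_files:
    print(path.name, path.exists())
```

Each `Path` in `csv_files` can be passed directly to `pd.read_csv`, which accepts path objects as well as strings.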
monkut