I am very new to coding (this is the first code I am writing).
I have multiple csv files, all with the same headers. The files correspond to hourly ozone concentration for every day of the year, and each file is a separate year [range from 2009-2020]. I have a column 'date' that contains the year-month-day, and I have a column for hour of the day (0-23). I want to separate the year from the month-day, combine the hour with the month-day and make this the index, and then merge the other csv files into one dataframe.
In addition, I need to average data values from each day at each hour for all 10 years, however, three of my files include leap days (an extra 24 values). I would appreciate any advice on how to account for the leap years. I assume that I would need to add the leap day to the files without it, then provide null values, then drop the null values (but that seems circular).
Also, if you have any tips on how to simplify my process, feel free to share!
Thanks in advance for your help.
Update: I tried the advice from Rookie below, but after importing csv data, I get an error message:
import pandas as pd
import os
path = "C:/Users/heath/Documents/ARB project Spring2020/ozone/SJV/SKNP"
df = pd.DataFrame()
for file in os.listdir(path):
df_temp = pd.read_csv(os.path.join(path, file))
df = pd.concat((df, df_temp), axis = 0)
First, I get an error message that says OSError: Initializing from file failed
.
I tried to fix the issue by adding engine = 'python'
based on advice from OSError: Initializing from file failed on csv in Pandas, but now I'm getting PermissionError: [Errno 13] Permission denied: 'C:/Users/heath/Documents/ARB project Spring2020/ozone/SJV/SKNP\\.ipynb_checkpoints'
Please help, I'm not sure what else to do. I edited the permission so that everyone has the read & write access. However, I still had the "permission denied" error when I imported the csv on Windows.