I am using this code to pull in two CSVs of a similar naming convention, place their filenames in a "File" column and concatenate the dataframes into one dataframe called NatHrs.
import glob
from pathlib import Path
path = r'C:\Users\ThisUser\Desktop\AC Mbr Analysis'
all_files = glob.glob(path + '\\Natl_hours_YTD_OC_*.csv')
Nat_dfs = []
for file in all_files:
df = pd.read_csv(file, index_col=None, encoding='windows-1252', header=1 )
df['File'] = file
Nat_dfs.append(df)
NatHrs = pd.concat(Nat_dfs)
Now, I want to take the "File" column, which returns a filename object with entries looking like "C:\Users\ThisUser\Desktop\AC Mbr Analysis\Natl_hours_YTD_OC_2018-2019", extract ONLY the end of the filename--in this case "2018-2019"--and place those characters into a new colum "Program Year", reflecting the entry "2018-2019". I am not having success with manipulating strings or series--should I be using path.replace? I am lost. When I describe the column I'm looking to parse...
NatHrs['File'].describe
...I get this:
Name: File, dtype: object>