I have a column named "date" in my pandas DataFrame that contains Unix timestamps (int64). I am trying to extract the month and year from each timestamp and add them to the DataFrame as new columns. Once I have the month and year, I want to build masks so that I can save new DataFrames, split by month and year, to CSV. Here is the code I have written:
# import useful libraries
from datetime import datetime
import pandas as pd
# read csv as dataframe
df=pd.read_csv('./ct.csv')
# function to get year
def get_year(x):
    return datetime.fromtimestamp(x).strftime("%Y")
# function to get month
def get_month(x):
    return datetime.fromtimestamp(x).strftime("%m")
# add month and year to new dataframe columns
df['year'] = df['date'].apply(get_year)
df['month'] = df['date'].apply(get_month)
# set the beginning and end date for mask
beginning = datetime(2002, 1, 1)
end = datetime(2003, 1, 1)
# get datetime from timestamp
def to_datetime(x):
    print(x)
    return datetime.fromtimestamp(x)
# create datetime series
df['datetime'] = df['date'].apply(to_datetime)
# create dataframe mask
msk = (df['datetime'] > beginning) & (df['datetime'] < end)
# apply mask
df_range = df[msk]
# write dataframe to csv
df_range.to_csv('ct_2002.csv', index=False)
I am getting the following error when trying to run this:
runfile('C:/Users/x/Desktop/Wine/daterange.py', wdir='C:/Users/x/Desktop/Wine')
Traceback (most recent call last):
  File "C:\Users\x\Desktop\Wine\daterange.py", line 17, in <module>
    df['year'] = df['date'].apply(get_year)
  File "C:\Users\x\Anaconda3\lib\site-packages\pandas\core\series.py", line 3848, in apply
    mapped = lib.map_infer(values, f, convert=convert_dtype)
  File "pandas\_libs\lib.pyx", line 2329, in pandas._libs.lib.map_infer
  File "C:\Users\x\Desktop\Wine\daterange.py", line 10, in get_year
    return datetime.fromtimestamp(x).strftime("%Y")
OSError: [Errno 22] Invalid argument
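One way to narrow this down is to check whether the "date" column contains values that the platform's time functions may reject. On Windows, datetime.fromtimestamp is commonly reported to raise this OSError for negative timestamps (dates before 1970) or for very large values such as milliseconds stored where seconds are expected. The snippet below is only a rough diagnostic sketch: the sample values and the upper plausibility bound (1 January 2100) are made up for illustration and are not taken from ct.csv.

```python
import pandas as pd

# Hypothetical sample standing in for df['date']; the real values come from ct.csv.
sample = pd.DataFrame({"date": [1009843200, 1041379200, -1, 1009843200000]})

# Assumption: valid values are seconds since 1970-01-01 and fall before 2100.
lower = 0
upper = pd.Timestamp("2100-01-01").timestamp()  # illustrative bound, not a platform limit

# Rows outside the plausible range are candidates for the failing fromtimestamp call.
suspect = sample[(sample["date"] < lower) | (sample["date"] > upper)]

print(sample["date"].min(), sample["date"].max())
print(suspect)
```

If the suspect rows turn out to be about a thousand times larger than the rest, the column may hold milliseconds rather than seconds.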
Any help would be much appreciated.
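For reference, the goal described above (year and month columns, a date-range mask, and one CSV per month) can also be sketched without any per-row datetime.fromtimestamp call. This is a minimal sketch under stated assumptions, not a confirmed fix: it assumes "date" holds seconds since the epoch, uses a small made-up frame in place of ct.csv, and the "value" column, the temporary output directory, and the file names are purely illustrative. Note that pd.to_datetime(..., unit="s") yields UTC-based times, whereas datetime.fromtimestamp uses the local time zone.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample standing in for ct.csv; 'date' is assumed to be epoch seconds.
df = pd.DataFrame({"date": [1009843200, 1012521600, 1041379200], "value": [1, 2, 3]})

# Vectorized conversion of the whole column (results are UTC-based, not local time).
df["datetime"] = pd.to_datetime(df["date"], unit="s")
df["year"] = df["datetime"].dt.year
df["month"] = df["datetime"].dt.month

# Half-open range [2002-01-01, 2003-01-01); unlike the strict '>' in the original
# mask, this keeps rows stamped exactly at midnight on 1 January 2002.
beginning = pd.Timestamp("2002-01-01")
end = pd.Timestamp("2003-01-01")
df_range = df[(df["datetime"] >= beginning) & (df["datetime"] < end)]

# One CSV per (year, month) group, written to an illustrative temporary directory.
out_dir = Path(tempfile.mkdtemp())
for (year, month), part in df.groupby(["year", "month"]):
    part.to_csv(out_dir / f"ct_{year}_{month:02d}.csv", index=False)
```

Grouping on the year and month columns avoids building a separate mask for every month by hand; the explicit mask is still useful when only one fixed range is needed.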