I have a large data set with two datetime columns. While cleaning it, I wrote a function that I apply to each cell of these two columns:
#dropoff_datetime
def date_time_dropoff(cell):
    if pd.isnull(cell) == True:
        return ('Missing Date Time dropoff')
    elif type(cell) == datetime.datetime:
        return ('Clean Date Time dropoff')
    elif type(cell) != datetime.datetime:
        return ('Type error Date Time dropoff')
When I run it the first time on my dataframe, it reports that the cells have the wrong type (indeed, df.dtypes shows the column as object). So next I convert the df using the pandas method:
df2['Statue_dropoff_datetime'] = df['dropoff_datetime'].apply(date_time_dropoff)
and run the function again, but it still returns the Type error result because the cell is not of type datetime.datetime. However, when I check df.dtypes after the conversion, I get the correct datetime type:
dropoff_datetime datetime64[ns]
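To narrow this down, here is a minimal sketch on a hypothetical two-row sample (not my real data) that looks at the type of a single cell after conversion. It assumes the conversion is done with pd.to_datetime; the names sample, cell, exact_match and is_subclass are made up for illustration.

```python
import datetime

import pandas as pd

# Hypothetical sample mirroring the column in the question.
sample = pd.DataFrame(
    {"dropoff_datetime": ["2020-06-10 11:16:00", "2020-06-21 11:53:06"]}
)

# Assumed conversion step: object (string) column -> datetime64[ns].
sample["dropoff_datetime"] = pd.to_datetime(sample["dropoff_datetime"])

# A single cell taken out of a datetime64[ns] column is a pandas Timestamp.
cell = sample["dropoff_datetime"].iloc[0]

# Exact type comparison, as in my function.
exact_match = type(cell) == datetime.datetime

# Subclass-aware comparison.
is_subclass = isinstance(cell, datetime.datetime)

print(type(cell).__name__, exact_match, is_subclass)
```

If this sketch reflects my data, the exact type comparison would explain the result: pandas.Timestamp is a subclass of datetime.datetime, so type(cell) == datetime.datetime is not satisfied even though the column dtype is datetime64[ns].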
I tried, without success, the answers suggested for similar questions about conversion, mainly: "Converting object column in pandas dataframe to datetime", "Convert Pandas Column to DateTime", "Converting object to datetime format in python", and "How to change the datetime format in pandas".
By the way, this is the import I use at the top of the script:
import datetime
Here's my df:
dropoff_datetime
2020-06-10 11:16:00
2020-06-21 11:53:06
2020-06-09 13:25:21
2020-06-01 19:06:05
#dropoff_datetime
def date_time_dropoff(cell):
    if pd.isnull(cell) == True:
        return ('Date Time dropoff missing')
    elif type(cell) == datetime:
        return ('Date Time dropoff clean')
    elif type(cell) != datetime:
        return ('Date Time dropoff Type error')
df1['Statue_dropoff_datetime'] = df['dropoff_datetime'].apply(date_time_dropoff)
and when I print:
print(pd.value_counts(df1.values.ravel()))
Type error Date Time dropoff 47703
while df.dtypes gives:
df.dtypes
dropoff_datetime datetime64[ns]
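For comparison, here is a hedged sketch of the same check written with isinstance instead of an exact type comparison. The function name dropoff_status and the three-element demo series are hypothetical and only there to exercise each branch; this is one possible variant, not a confirmed fix for my full data set.

```python
import datetime

import pandas as pd


def dropoff_status(cell):
    """Classify one cell as missing, clean, or wrongly typed."""
    # Missing values (None, NaN, NaT) are caught first.
    if pd.isnull(cell):
        return "Missing Date Time dropoff"
    # isinstance also accepts subclasses such as pandas.Timestamp.
    if isinstance(cell, datetime.datetime):
        return "Clean Date Time dropoff"
    return "Type error Date Time dropoff"


# Hypothetical demo values: a valid timestamp, a missing value, a stray string.
demo = pd.Series(
    [pd.Timestamp("2020-06-10 11:16:00"), pd.NaT, "not a date"],
    dtype=object,
)

statuses = demo.apply(dropoff_status).tolist()
print(statuses)
```

The order of the branches matters here: the missing-value test runs before the isinstance test, so pd.NaT is reported as missing rather than being examined for its type.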