I have written a UDF to replace a few specific date values in a column named "latest_travel_date" with 'NA'. The column also contains many null values, so the UDF handles those as well (see below).
Query:
def date_cleaner(date_col):
    if type(date_col) == NoneType:
        pass
    else:
        if year(date_col) in ('1899','1900'):
            date_col = 'NA'
        else:
            pass
    return date_col

date_cleaner_udf = udf(date_cleaner, DateType())
Df3 = Df2.withColumn("latest_cleaned", date_cleaner_udf("latest_travel_date"))
However, I keep getting this error: NameError: global name 'NoneType' is not defined
Can anyone please help me resolve this?
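For reference, here is a minimal sketch of how the null check could be written without NoneType, which is not a built-in name in Python. It rests on a few assumptions that are not in my original code: that the value reaching the UDF is a Python datetime.date (or None), that the year is read with the .year attribute rather than the column function year(), and that the result is returned as a string, since 'NA' probably cannot be stored in a DateType column. The plain function is shown on its own so it can be run without Spark.

```python
import datetime

def date_cleaner(date_val):
    """Return None for nulls, 'NA' for placeholder years, else the ISO date string."""
    # Compare against None directly; there is no built-in name NoneType.
    if date_val is None:
        return None
    # Assumption: date_val is a datetime.date, so .year is an int.
    if date_val.year in (1899, 1900):
        return 'NA'
    # Assumption: keep other dates as 'YYYY-MM-DD' strings so the column has one type.
    return date_val.isoformat()

# Quick check outside Spark.
print(date_cleaner(None))
print(date_cleaner(datetime.date(1900, 1, 1)))
print(date_cleaner(datetime.date(2016, 5, 4)))
```

If this works as expected, it could presumably be registered with udf(date_cleaner, StringType()) instead of DateType(), though I have not verified that against my data.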