I am new to PySpark and I am trying to write a reusable function that converts a given column from StringType to TimestampType.
The input column strings look like this: 23/04/2021 12:00:00 AM
I want to convert them to TimestampType so I can get the latest date using PySpark.
Below is the function I have created so far:
def datetype_change(self, key, col):
    self.log.info("datetype_change...".format(self.app_name.upper()))
    self.df[key] = self.df[key].withColumn("column_name", F.unix_timestamp(F.col("column_name"), 'yyyy-MM-dd HH:mm:ss').cast(TimestampType()))
When I run it, I get this error:
NameError: name 'TimestampType' is not defined
How do I change this function so it produces the intended output?