I'm trying to convert a column of GMT timestamp strings into a column of timestamps in Eastern timezone. I want to take daylight savings into account.
My column of timestamp strings look like this:
'2017-02-01T10:15:21+00:00'
I figured out how to convert the string column into a timestamp in EST:
from pyspark.sql import functions as F
df2 = df1.withColumn('datetimeGMT', df1.myTimeColumnInGMT.cast('timestamp'))
df3 = df2.withColumn('datetimeEST', F.from_utc_timestamp(df2.datetimeGMT, "EST"))
But the times don't change with daylight savings. Is there another function or something that accounts for daylight savings with converting the timestamps?
EDIT: I think I figured it out. In the from_utc_timestamp call above, I needed to use "America/New_York" instead of "EST":
df3 = df2.withColumn('datetimeET', F.from_utc_timestamp(df2.datetimeGMT, "America/New_York"))