I have about 800,000 rows of data in a dataframe. One column, df['Date'], holds date-time strings in the form 'YYYY-MM-DD HH:MM:SS.fff' with no timezone information. However, I know they are in the America/New_York timezone, and they need to be converted to CET. I currently have two methods to get the job done:
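Method 1 (very slow for sure):
import datetime
from pytz import timezone
# Parse each string, attach the New York zone, then convert to CET (one Python call per row)
df['Date'].apply(lambda x: timezone('America/New_York')
                 .localize(datetime.datetime.strptime(x, '%Y-%m-%d %H:%M:%S.%f'))
                 .astimezone(timezone('CET')))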
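Method 2:
import pandas as pd
df.index = pd.to_datetime(df['Date'], format='%Y-%m-%d %H:%M:%S.%f')
# tz_localize and tz_convert return new objects, so the result is assigned back
df.index = df.index.tz_localize('America/New_York').tz_convert('CET')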
I am wondering whether there is a better way to do this, and whether the methods I listed have any potential pitfalls. Thanks!
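On the pitfalls question, one thing that may be worth checking is how daylight-saving transitions are handled. The sketch below is only an illustration under assumptions: the sample strings are made up, and mapping problem rows to NaT is just one possible policy. By default tz_localize raises an error on wall-clock times that are ambiguous (clocks going back) or nonexistent (clocks going forward).

```python
import pandas as pd

# Hypothetical sample data (not from the real dataframe):
# 2021-11-07 01:30 happens twice in New York, 2021-03-14 02:30 never happens.
s = pd.Series([
    '2021-11-07 01:30:00.000',
    '2021-03-14 02:30:00.000',
    '2021-06-01 12:00:00.000',
])
naive = pd.to_datetime(s, format='%Y-%m-%d %H:%M:%S.%f')

# Turn ambiguous/nonexistent wall-clock times into NaT instead of raising,
# so the affected rows can be inspected afterwards.
localized = naive.dt.tz_localize('America/New_York',
                                 ambiguous='NaT', nonexistent='NaT')
converted = localized.dt.tz_convert('CET')

# Rows that could not be localized unambiguously
problem_rows = converted[converted.isna()]
```

Other values for the ambiguous and nonexistent arguments exist (for example 'infer' or 'shift_forward'); which one is appropriate depends on how the data was recorded.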
Also, I would like to shift all timestamps by a fixed amount of time, such as 1 ms (timedelta(0,0,1000)). How can I implement that using method 2?
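For the shift, a minimal sketch along the lines of method 2, assuming the strings really follow 'YYYY-MM-DD HH:MM:SS.fff'. The two-row dataframe is made-up sample data; adding a pd.Timedelta to a DatetimeIndex is applied to every element at once.

```python
import pandas as pd

# Hypothetical two-row stand-in for the real 800,000-row dataframe
df = pd.DataFrame({'Date': ['2021-06-01 12:00:00.000',
                            '2021-06-01 12:00:00.999']})

# Vectorized parse, localize to New York, convert to CET
idx = pd.DatetimeIndex(
    pd.to_datetime(df['Date'], format='%Y-%m-%d %H:%M:%S.%f')
)
idx = idx.tz_localize('America/New_York').tz_convert('CET')

# Shift every timestamp by 1 ms in a single vectorized operation
df.index = idx + pd.Timedelta(milliseconds=1)
```

A datetime.timedelta(0, 0, 1000) can be added in the same way, since it also represents 1000 microseconds.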