4

this might be an easy question to answer but somehow I don't get the solution. The problem is, that I have data from three measurement points. Two of them are using the dtype: datetime64[ns, UTC] But the last one uses dtype: datetime64[ns].

I came across many answers but nothing worked for me so far but maybe it's just my issue of not knowing how to work with pandas. So with pandas, I would like to merge the data along with the timestamp. But pandas doesn't let me because of the different timestamps with the following message:

ValueError: You are trying to merge on datetime64[ns, UTC] and datetime64[ns] columns. If you wish to proceed you should use pd.concat

What exactly do I have to type to convert my values correctly? I tried a lot like:

pd.to_datetime(df.time, unit='ns').dt.tz_localize('UTC')

or

pd.to_datetime(df, unit='ns').dt.tz_localize('UTC').dt.tz_convert('US/Eastern')

but nothing was working for me so far... I very much appreciate your help.

RoKa
  • 105
  • 6

2 Answers2

4

You can use also all_timezones from pytz.

import pandas as pd
from pytz import all_timezones
print(all_timezones[-5:])

You can see the last 5:

['UTC', 'Universal', 'W-SU', 'WET', 'Zulu']

I'll create an example dataframe:

date = pd.Series(pd.date_range('9/1/2020', periods=10, freq='D'))

Result:

0   2020-09-01
1   2020-09-02
2   2020-09-03
3   2020-09-04
4   2020-09-05
5   2020-09-06
6   2020-09-07
7   2020-09-08
8   2020-09-09
9   2020-09-10
dtype: datetime64[ns]

You can change it, for example:

date_with_utc = date.dt.tz_localize('Europe/London')
print(date_with_utc)

Result:

0   2020-09-01 00:00:00+01:00
1   2020-09-02 00:00:00+01:00
2   2020-09-03 00:00:00+01:00
3   2020-09-04 00:00:00+01:00
4   2020-09-05 00:00:00+01:00
5   2020-09-06 00:00:00+01:00
6   2020-09-07 00:00:00+01:00
7   2020-09-08 00:00:00+01:00
8   2020-09-09 00:00:00+01:00
9   2020-09-10 00:00:00+01:00
dtype: datetime64[ns, Europe/London]

All code:

import pandas as pd
from pytz import all_timezones

date = pd.Series(pd.date_range('9/1/2020', periods=10, freq='D'))
date_with_utc = date.dt.tz_localize('Europe/London')

print(date_with_utc)
Adrian
  • 380
  • 3
  • 13
  • It's working perfectly, thanks a lot. Unfortunately I can not upvote your answer because it says: Thanks for the feedback! Votes cast by those with less than 15 reputation are recorded, but do not change the publicly displayed post score. – RoKa Sep 12 '20 at 12:27
  • Don't worry @RoKa. Thanks for your answer! Regards. – Adrian Sep 12 '20 at 12:28
1

Can you try to convert both the columns that you want to merge using this and try the merge after?

df1['time'] = pd.to_datetime(df1['time'], utc = True)
df2['time'] = pd.to_datetime(df2['time'], utc = True)
XXavier
  • 1,206
  • 1
  • 10
  • 14