I have a Pandas Series ("timeSeries") that includes a time of day. Some of the items are blank, some are actual times (08:00; 13:00), some are indications of time (morning, early afternoon).

As the times I have are in New York time, I would like to convert the items that are in time format to London time. Using pd.to_datetime(timeSeries, errors='ignore') does not work once I also add timedelta(hours=5). So I attempted to add an if condition, but it does not work.

Sample Initial DataFrame:

dfNY = pd.DataFrame({'TimeSeries': ['13:00', np.nan, '06:00', 'Morning', 'Afternoon', np.nan, np.nan, '01:30']})

Desired Result:

dfLondon = pd.DataFrame({'TimeSeries': ['18:00', np.nan, '11:00', 'Morning', 'Afternoon', np.nan, np.nan, '06:30']})

Any help or simplification of my code would be great.

london = dt.datetime.now(timezone("America/New_York"))
newYork = dt.datetime.now(timezone("Europe/London"))
timeDiff = (london - dt.timedelta(hours = newYork.hour)).hour

for dayTime in timeSeries: 
     if dayTime == "%%:%%": 
        print(dayTime)
        dayTime = pd.to_datetime(dayTime) + dt.timedelta(hours=timeDiff)
return timeSeries

Update: using the pytz method in the comment below yields a converted time that is off by 5 minutes. How do we fix this?

ohoh7171

1 Answer

Using the .dt accessor, you can assign a timezone to your values and then convert them to another one, using tz_localize and tz_convert.

import pandas as pd
import numpy as np

pd.options.display.max_columns = 5

df = pd.DataFrame({'TimeSeries': ["13:00", np.nan, "06:00", 'Morning', 'Afternoon', np.nan, np.nan, "01:30"]})

#   Convert the data to datetime. Values that cannot be parsed would raise errors,
#   but errors='coerce' turns them into NaT instead, so we do not need to care about them.
#   We also explicitly mark the datetimes as belonging to a specific timezone.
df['TimeSeries_TZ'] = pd.to_datetime(df['TimeSeries'], errors='coerce', format='%H:%M')\
                     .dt.tz_localize('America/New_York')
print(df['TimeSeries_TZ'])
# 0   1900-01-01 13:00:00-04:56
# 1                         NaT
# 2   1900-01-01 06:00:00-04:56
# 3                         NaT
# 4                         NaT
# 5                         NaT
# 6                         NaT
# 7   1900-01-01 01:30:00-04:56

#   Then, we can use the datetime accessor to convert the timezone.
df['Converted_time'] = df['TimeSeries_TZ'].dt.tz_convert('Europe/London').dt.strftime('%H:%M')
print(df['Converted_time'])
# 0    17:55
# 1      NaT
# 2    10:55
# 3      NaT
# 4      NaT
# 5      NaT
# 6      NaT
# 7    06:25

#   If you want to convert the values that CAN be converted while keeping the ones that
#   could not be parsed, take the converted time where parsing succeeded and fall back to
#   the original value where it produced NaT (not a timestamp).
df['TimeSeries_result'] = df['Converted_time'].where(df['TimeSeries_TZ'].notna(), df['TimeSeries'])


print(df[['TimeSeries', 'TimeSeries_result']])
#   TimeSeries TimeSeries_result
# 0      13:00             17:55
# 1        NaN               NaN
# 2      06:00             10:55
# 3    Morning           Morning
# 4  Afternoon         Afternoon
# 5        NaN               NaN
# 6        NaN               NaN
# 7      01:30             06:25
IMCoins
  • Hey thanks for the help! The issue with this output is that, as you can see - the TimeSeries result is off by 5 minutes. How can we fix this? – ohoh7171 Oct 01 '19 at 09:30
  • @user2266957 The four-minute offset is explained here: https://stackoverflow.com/a/41304707/1678467 – tnknepp Oct 01 '19 at 13:57
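
Following up on the offset discussed in the comments: parsing with format='%H:%M' leaves the date at 1900-01-01, and the -04:56 offset in the output above is what the timezone database holds for New York at that date, before standard time zones applied. One possible workaround, sketched here under assumptions rather than offered as the definitive fix, is to attach a real calendar date before localizing. The names reference_date, time_of_day, ny, london and TimeSeries_London are hypothetical, and the date itself is an assumption you would replace with the day your times actually refer to.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'TimeSeries': ["13:00", np.nan, "06:00", 'Morning',
                                  'Afternoon', np.nan, np.nan, "01:30"]})

# Assumption: all times refer to this calendar date (hypothetical value).
reference_date = pd.Timestamp("2019-10-01")

# Parse HH:MM strings; anything else (NaN, 'Morning', ...) becomes NaT.
parsed = pd.to_datetime(df['TimeSeries'], format='%H:%M', errors='coerce')

# The parsed values sit on 1900-01-01, so subtracting that date leaves
# only the time of day as a Timedelta.
time_of_day = parsed - pd.Timestamp("1900-01-01")

# Re-attach the time of day to the reference date, then localize, so the
# offset in force on that date is used instead of the one for 1900.
ny = (reference_date + time_of_day).dt.tz_localize('America/New_York')
london = ny.dt.tz_convert('Europe/London').dt.strftime('%H:%M')

# Keep converted times where parsing worked, original values elsewhere.
df['TimeSeries_London'] = london.where(ny.notna(), df['TimeSeries'])
print(df)
```

With the assumed date of 2019-10-01, New York is five hours behind London, so 13:00, 06:00 and 01:30 become 18:00, 11:00 and 06:30, while 'Morning', 'Afternoon' and the NaN entries are left untouched. Note that the result depends on the chosen date, since the two cities switch daylight saving time on different days.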