I know that sometimes when you're converting between timezones Python gets confused about what the result should be, because timezones are hard.

from pandas import Timestamp

string = "1900-01-01 00:00:00"
ts = Timestamp(string, tz='US/Eastern')
print(ts)

Timestamp('1900-01-01 00:00:00-0456', tz='US/Eastern')

Obviously the offset should not be four hours and 56 minutes.

When it gets it wrong, is there a way to insist on what the utcoffset should be?

I'm only converting between 'US/Eastern' and 'UTC', so the offset should only ever be four or five hours. What I'd like to do is check whether the offset is an integer number of hours and, if not, round it to the nearest whole hour.

Batman

1 Answer

In the timezone data that pytz ships, the utcoffset for US/Eastern before 1901-12-13 20:45:52 UTC is 4 hours and 56 minutes behind UTC.

You can confirm this using pytz, which is built on the Olson database. This is the same module that Pandas uses to perform timezone calculations:

import pytz
eastern = pytz.timezone('US/Eastern')
for utcdate, info in zip(
        eastern._utc_transition_times, eastern._transition_info):
    utcoffset, dstoffset, tzabbrev = info
    print('{} | {} '.format(utcdate, utcoffset.total_seconds()))

This prints all the UTC transition boundaries and utcoffsets (in seconds) for the US/Eastern timezone. The first few lines look like this:

0001-01-01 00:00:00 | -17760.0 
1901-12-13 20:45:52 | -18000.0 
1918-03-31 07:00:00 | -14400.0 
1918-10-27 06:00:00 | -18000.0 
1919-03-30 07:00:00 | -14400.0 
...

So before 1901-12-13 20:45:52, the utcoffset was -17760 seconds (or, equivalently, 4 hours and 56 minutes).
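
As a quick check of the arithmetic, the following sketch uses only the standard library to convert the -17760 seconds from the table into hours and minutes. It also shows that the 1901-12-13 20:45:52 boundary coincides with the smallest value a signed 32-bit Unix timestamp can hold. Reading that as a limit of the data format rather than a historical event is an assumption, not something the table itself states.

```python
from datetime import datetime, timedelta, timezone

# Offset reported for US/Eastern in the first row of the table above.
offset_seconds = -17760

# Split the magnitude into whole hours and leftover minutes.
hours, remainder = divmod(abs(offset_seconds), 3600)
minutes = remainder // 60
print(hours, minutes)  # 4 56

# Smallest signed 32-bit Unix timestamp, expressed as a UTC datetime.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
boundary = epoch + timedelta(seconds=-2**31)
print(boundary)  # 1901-12-13 20:45:52+00:00
```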


The standard way to make a timezone-aware date from a local time using pytz is to call the localize method:

import datetime as DT
import pytz
eastern = pytz.timezone('US/Eastern')
date = DT.datetime(1900,1,1)
local_date = eastern.localize(date)
print(local_date)

prints

1900-01-01 00:00:00-04:56

This confirms that the Timestamp returned by Pandas is correct:

import pandas as pd
string = "1900-01-01 00:00:00"
ts = pd.Timestamp(string, tz='US/Eastern')
print(ts)
# 1900-01-01 00:00:00-04:56
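
If a whole-hour offset is still wanted, as the question asks, one option is to round the offset yourself and attach a fixed-offset timezone. The sketch below uses only the standard library; `round_offset_to_hour` and `force_whole_hour_offset` are hypothetical helper names, not part of pytz or Pandas. It assumes the wall-clock time should be kept, which means the underlying instant in UTC shifts by the rounding difference.

```python
from datetime import datetime, timedelta, timezone


def round_offset_to_hour(offset: timedelta) -> timedelta:
    """Round a UTC offset to the nearest whole hour (hypothetical helper)."""
    # Note: Python's round() sends exact halves (e.g. +05:30) to the even hour.
    return timedelta(hours=round(offset.total_seconds() / 3600))


def force_whole_hour_offset(dt: datetime) -> datetime:
    """Keep the wall-clock time but replace the tzinfo with a fixed offset."""
    offset = dt.utcoffset()
    if offset is None:
        raise ValueError("expected a timezone-aware datetime")
    if offset.total_seconds() % 3600 == 0:
        return dt  # already a whole number of hours
    return dt.replace(tzinfo=timezone(round_offset_to_hour(offset)))


# The -04:56 offset from the example above, modelled as a fixed offset.
lmt = timezone(timedelta(hours=-4, minutes=-56))
fixed = force_whole_hour_offset(datetime(1900, 1, 1, tzinfo=lmt))
print(fixed)  # 1900-01-01 00:00:00-05:00
```

The result carries a fixed offset, so it no longer follows daylight saving rules. That makes it suitable for one-off corrections of old dates rather than for general conversion.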
unutbu
  • If I had the authority to grant you a wizard hat... I would! – piRSquared Dec 23 '16 at 16:24
  • Well that's a thing I learned today. And here I was blaming `pytz`. Out of curiosity, do you (or anyone else) happen to know why the offset was four hours and 56 minutes? – Batman Dec 23 '16 at 16:33
  • Back in the 19th century (and early 20th) people used [local mean time](https://en.wikipedia.org/wiki/Local_mean_time) instead of standard time. – unutbu Dec 23 '16 at 16:34
  • Cheers mate. Very much appreciated. – Batman Dec 23 '16 at 16:37