
If I run the following git log command (here, in this repo: https://github.com/rubyaustralia/rubyconfau-2013-cfp):

$ git --no-pager log --reverse --date=raw --pretty='%ad %h'
1344507869 -0700 314b3d4
1344508222 +1000 dffde53
1344510528 +1000 17e7d3b
...

... I get a list with both a Unix timestamp (seconds since the epoch) and a UTC offset for every commit. What I would like to do is obtain a timezone-aware datetime that will:

  • Show me the date/time as the commit author saw it at the time (as per the recorded UTC offset)
  • Show me the date/time as I would have seen it in my local timezone

In the first case, all I have is a UTC offset, not the author's time zone - and as such I'd have no information about possible daylight savings changes.

In the second case, my OS would most likely be set up with a locale that includes a (geographical) timezone, which is aware of DST changes; for example, the CET timezone has a UTC offset of +0100 in winter, but during summer daylight saving time it has a UTC offset of +0200 (and is then called CEST).
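To illustrate that the local UTC offset depends on the instant, here is a minimal standard-library sketch. The helper name `local_utc_offset` and the `winter` instant are assumptions made for this illustration; the result depends entirely on how the machine running it is configured.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)

def local_utc_offset(epoch_secs):
    """UTC offset of the OS-configured local zone at the given instant
    (hypothetical helper, not part of any library)."""
    # fromtimestamp() yields the local wall clock for that instant
    local_naive = datetime.datetime.fromtimestamp(epoch_secs)
    # plain arithmetic from the epoch yields the UTC wall clock
    utc_naive = EPOCH + datetime.timedelta(seconds=epoch_secs)
    return local_naive - utc_naive

summer = 1344508222             # 2012-08-09 10:30:22 UTC, from the log above
winter = summer + 150 * 86400   # 150 days later, in early January 2013

# Offsets in hours; the two values differ only if the local zone observes DST
print(local_utc_offset(summer).total_seconds() / 3600.0)
print(local_utc_offset(winter).total_seconds() / 3600.0)
```

On a machine configured for a zone with DST, the two printed values should differ by one hour; on a machine set to UTC both are zero.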

In any case, I'd want to start with a UTC timestamp, because the "1344508222" count of epoch seconds is independent from timezones; the offset +1000 would simply help us see the human-readable output hopefully as the author saw it.

I need to do this for a Python 2.7 project. I scoured a ton of resources (SO posts) and came up with the following example, which attempts to parse the second line from the above snippet, "1344508222 +1000 dffde53". However, I'm really not sure if it is right; so ultimately, my question is: what would be the right way to do this?

Preamble:

#!/usr/bin/env python2
# -*- coding: utf-8 -*-

import datetime
import pytz
import dateutil.tz
import time

def getUtcOffsetFromString(in_offset_str): # SO:1101508
  offset = int(in_offset_str[-4:-2])*60 + int(in_offset_str[-2:])
  if in_offset_str[0] == "-":
    offset = -offset
  return offset

class FixedOffset(datetime.tzinfo): # SO:1101508
  """Fixed offset in minutes: `time = utc_time + utc_offset`."""
  def __init__(self, offset):
    self.__offset = datetime.timedelta(minutes=offset)
    hours, minutes = divmod(offset, 60)
    #NOTE: the last part is to remind about deprecated POSIX GMT+h timezones
    #  that have the opposite sign in the name;
    #  the corresponding numeric value is not used e.g., no minutes
    self.__name = '<%+03d%02d>%+d' % (hours, minutes, -hours)
  def utcoffset(self, dt=None):
    return self.__offset
  def tzname(self, dt=None):
    return self.__name
  def dst(self, dt=None):
    return datetime.timedelta(0)
  def __repr__(self):
    return 'FixedOffset(%d)' % (self.utcoffset().total_seconds() / 60)

Start of parsing:

tstr = "1344508222 +1000 dffde53"
tstra = tstr.split(" ")
unixepochsecs = int(tstra[0])
utcoffsetstr = tstra[1]
print(unixepochsecs, utcoffsetstr)  # (1344508222, '+1000')

Get UTC timestamp - first I attempted to parse the string "1344508222 +1000" with dateutil.parser.parse:

justthetstz = " ".join(tstra[:2])
print(justthetstz)  # '1344508222 +1000'
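#import dateutil.parser  # needed for the line below; importing dateutil.tz alone does not load the parser
#print(dateutil.parser.parse(justthetstz)) # ValueError: Unknown string format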

... but that unfortunately fails.

This worked to get UTC timestamp:

# SO:12978391: "datetime.fromtimestamp(self.epoch) returns localtime that shouldn't be used with an arbitrary timezone.localize(); you need utcfromtimestamp() to get datetime in UTC and then convert it to a desired timezone"
dtstamp = datetime.datetime.utcfromtimestamp(unixepochsecs).replace(tzinfo=pytz.utc)
print(dtstamp)                # 2012-08-09 10:30:22+00:00
print(dtstamp.isoformat())    # 2012-08-09T10:30:22+00:00 # ISO 8601

Ok, so far so good - this UTC timestamp looks reasonable.

Now, trying to get the date in author's UTC offset - apparently a custom class is needed here:

utcoffset = getUtcOffsetFromString(utcoffsetstr)
fixedtz = FixedOffset(utcoffset)
print(utcoffset, fixedtz)   # (600, FixedOffset(600))
dtstampftz = dtstamp.astimezone(fixedtz)
print(dtstampftz)             # 2012-08-09 20:30:22+10:00
print(dtstampftz.isoformat()) # 2012-08-09T20:30:22+10:00

This looks reasonable too, 10:30 in UTC would be 20:30 in +1000; then again, an offset is an offset, no ambiguity here.
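The same arithmetic can be cross-checked without any tzinfo class, by adding the offset to the UTC wall clock as a plain timedelta. This is only a sketch: `parse_log_line` is a hypothetical helper, and it assumes each line has exactly the three whitespace-separated fields shown in the log output above.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)

def parse_log_line(line):
    """Split '<epoch> <offset> <hash>' into (utc, author_wall_clock, hash).

    Both datetimes are naive: the first is the UTC wall clock, the second
    is the wall clock at the author's recorded UTC offset.
    """
    epoch_str, offset_str, commit = line.split()
    sign = -1 if offset_str[0] == "-" else 1
    minutes = sign * (int(offset_str[-4:-2]) * 60 + int(offset_str[-2:]))
    utc_naive = EPOCH + datetime.timedelta(seconds=int(epoch_str))
    author_naive = utc_naive + datetime.timedelta(minutes=minutes)
    return utc_naive, author_naive, commit

for line in ["1344507869 -0700 314b3d4",
             "1344508222 +1000 dffde53",
             "1344510528 +1000 17e7d3b"]:
    utc_naive, author_naive, commit = parse_log_line(line)
    print("%s  UTC %s  author %s" % (commit, utc_naive, author_naive))
```

For dffde53 this gives 2012-08-09 10:30:22 in UTC and 2012-08-09 20:30:22 for the author, matching the output above.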

Now I'm trying to derive the datetime in my local timezone - first, it looks like I shouldn't use the .replace method:

print(time.tzname[0]) # CET
tzlocal = dateutil.tz.tzlocal()
print(tzlocal) # tzlocal()
dtstamplocrep = dtstamp.replace(tzinfo=tzlocal)
print(dtstamp)                # 2012-08-09 10:30:22+00:00
print(dtstamplocrep)          # 2012-08-09 10:30:22+02:00 # not right!

This doesn't look right: I got the exact same "clock string" but with a different offset, which means it now refers to a different instant.

However, .astimezone seems to work:

dtstamploc = dtstamp.astimezone(dateutil.tz.tzlocal())
print(dtstamp)                # 2012-08-09 10:30:22+00:00
print(dtstamploc)             # 2012-08-09 12:30:22+02:00 # was August -> summer -> CEST: UTC+2h
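The difference between .replace and .astimezone can be reproduced with the standard library alone. In this sketch, `Fixed` is a hypothetical minimal tzinfo written for the illustration, and `Fixed(120)` only stands in for CEST: it has no knowledge of DST, unlike dateutil.tz.tzlocal() or a named pytz zone.

```python
import datetime

class Fixed(datetime.tzinfo):
    """Minimal fixed-offset tzinfo (hypothetical helper for this sketch)."""
    def __init__(self, minutes):
        self._offset = datetime.timedelta(minutes=minutes)
    def utcoffset(self, dt):
        return self._offset
    def dst(self, dt):
        return datetime.timedelta(0)
    def tzname(self, dt):
        return None

utc = Fixed(0)
cest = Fixed(120)  # stand-in for CEST (UTC+2), not a real DST-aware zone

dt_utc = datetime.datetime(2012, 8, 9, 10, 30, 22, tzinfo=utc)
relabelled = dt_utc.replace(tzinfo=cest)  # same clock reading, different instant
converted = dt_utc.astimezone(cest)       # same instant, different clock reading

print(relabelled.isoformat())  # 2012-08-09T10:30:22+02:00
print(converted.isoformat())   # 2012-08-09T12:30:22+02:00
print(relabelled == dt_utc)    # False
print(converted == dt_utc)     # True
print(dt_utc - relabelled)     # 2:00:00
```

Aware datetimes compare by the instant they denote, so the equality checks show that only .astimezone preserves the original point in time.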

I get the same with a named pytz.timezone:

cphtz = pytz.timezone('Europe/Copenhagen')
dtstamploc = dtstamp.astimezone(cphtz)
print(dtstamp)                # 2012-08-09 10:30:22+00:00
print(dtstamploc)             # 2012-08-09 12:30:22+02:00 # is August -> summer -> CEST: UTC+2h

... however, I cannot use .localize here, since my input dtstamp already has a timezone associated with it, and is therefore not "naive" anymore:

# dtstamploc = cphtz.localize(dtstamp, is_dst=True) # ValueError: Not naive datetime (tzinfo is already set)

Ultimately, so far this looks correct, but I'm really uncertain about it, especially since I came across this (from "pytz.astimezone not accounting for daylight savings?"):

You can't assign the timezone in the datetime constructor, because it doesn't give the timezone object a chance to adjust for daylight savings - the date isn't accessible to it. This causes even more problems for certain parts of the world, where the name and offset of the timezone have changed over the years.

From the pytz documentation:

Unfortunately using the tzinfo argument of the standard datetime constructors "does not work" with pytz for many timezones.

Use the localize method with a naive datetime instead.

... which ended up confusing me: say I want to do this, and I already have a correct timezone-aware timestamp; how would I derive a "naive" datetime from it? Just get rid of the timezone info? Or is the right "naive" datetime derived from the version of the timestamp expressed in UTC (e.g. 2012-08-09 20:30:22+10:00 -> 2012-08-09 10:30:22+00:00, so that the right "naive" datetime would be 2012-08-09 10:30:22)?
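One way to explore this is to strip the tzinfo from both representations and compare what comes back. The sketch below uses only fixed offsets, with a hypothetical `Fixed` tzinfo written for the illustration; for named pytz zones, the quoted documentation says to use localize rather than attaching the tzinfo directly, so the .replace(tzinfo=...) calls here should be read as valid for fixed offsets only.

```python
import datetime

class Fixed(datetime.tzinfo):
    """Minimal fixed-offset tzinfo (hypothetical helper for this sketch)."""
    def __init__(self, minutes):
        self._offset = datetime.timedelta(minutes=minutes)
    def utcoffset(self, dt):
        return self._offset
    def dst(self, dt):
        return datetime.timedelta(0)
    def tzname(self, dt):
        return None

utc = Fixed(0)
plus10 = Fixed(600)

aware_utc = datetime.datetime(2012, 8, 9, 10, 30, 22, tzinfo=utc)
aware_plus10 = aware_utc.astimezone(plus10)

# Dropping tzinfo keeps the clock reading of whichever zone the value was in
naive_from_utc = aware_utc.replace(tzinfo=None)
naive_from_plus10 = aware_plus10.replace(tzinfo=None)
print(naive_from_utc)      # 2012-08-09 10:30:22
print(naive_from_plus10)   # 2012-08-09 20:30:22

# Re-attaching the zone each naive value came from recovers the same instant
print(naive_from_utc.replace(tzinfo=utc) == aware_utc)        # True
print(naive_from_plus10.replace(tzinfo=plus10) == aware_utc)  # True

# Attaching the wrong zone shifts the instant by the offset difference
mixed = naive_from_plus10.replace(tzinfo=utc)
print(mixed - aware_utc)   # 10:00:00
```

Under these assumptions a naive datetime is only meaningful together with the zone it was taken from: either naive value is "right" as long as it is reinterpreted in that same zone.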

sdaau
  • I'm confused. You have the naive datetime at first. Why do you need to derive it again? Why not use that naive datetime object to generate two datetime objects in two different timezones? – Sraw Dec 25 '18 at 06:59
  • Thanks @Sraw - that's what I don't understand: is a UTC timestamp (with `tzinfo=pytz.utc`) a naive datetime, or not? I was thinking it is not, because it has an associated timezone info (that is, `pytz.utc`)? – sdaau Dec 25 '18 at 07:10
  • No, it isn't. A naive datetime object is one without timezone info. So I think the solution is simple: First, get a naive datetime object from that timestamp. Second, generate two different datetime objects with the timezones you need using `timezone_you_need.localize`. – Sraw Dec 25 '18 at 07:14
  • Thanks again @Sraw - good to have that clarified! Just one more thing - should the naive timestamp be derived from the UTC one? I mean, if from my integer timestamp I get the +10 offset representation `2012-08-09T20:30:22+10:00`, and then use (2012,08,09,20,30,22) as the naive datetime, and then try to convert it again to +10, I presume I won't get 20:30:22 as the clock again - which makes me think it is the UTC representation that the naive datetime should be derived from? – sdaau Dec 25 '18 at 07:19
  • It is what "naive" means. A naive datetime object doesn't contain timezone info, so that means it is ambiguous. You can use `replace` to convert it directly into the `+10` timezone. It will still give you `2012-08-09T20:30:22+10:00`. But as we all like clear things, `pytz` treats a naive datetime as UTC time. Or further, honestly, the built-in datetime package is not a good design. At least this ambiguous naive datetime object is bad :( And that's also why we need `pytz` and some other packages. – Sraw Dec 25 '18 at 07:35
