
I have data points whose abscissas are datetime.datetime objects with a time zone (their tzinfo happens to be a bson.tz_util.FixedOffset obtained through MongoDB).

When I plot them with scatter(), what is the time zone of the tick labels?

Changing the timezone in matplotlibrc does not change anything in the displayed plot (I must have misunderstood the discussion on time zones in the Matplotlib documentation).

I experimented a little with plot() (instead of scatter()). When given a single date, it plots it and ignores the time zone. However, when given multiple dates, it uses a fixed time zone, but how is it determined? I can't find anything in the documentation.

Finally, is plot_date() supposed to be the solution to these time zone problems?

Eric O. Lebigot
  • It looks like `axes.xaxis_data(tz)` will set all the dates to be displayed in _that_ time zone. If you don't explicitly set the time zone it looks like it will convert the times to your local time zone (I just skimmed the code, I could be way off). – tacaswell Mar 07 '14 at 16:36
  • Yeah, it looks like the documentation lies.... – tacaswell Mar 07 '14 at 16:37
  • Thanks. When a single time is plotted by `scatter()` it just ignores the time zone (it does *not* use the local time zone)… Where does the documentation "lie"? I only see a lack of information regarding the displayed time zone. – Eric O. Lebigot Mar 08 '14 at 02:28
  • What I think is the lie is the claim that it pays attention to the rcparam, from looking at the code it looks like it defaults to what ever datetime does (which I assumed was the local timezone, but that is apparently wrong). – tacaswell Mar 08 '14 at 23:01
  • The actual method to use is ``axes.xaxis_date(tz)``, [doc for this method](http://matplotlib.org/api/axes_api.html#matplotlib.axes.Axes.xaxis_date) – sberder Nov 14 '15 at 01:03
  • see also https://stackoverflow.com/a/45728316/288875 – Andre Holzner Aug 17 '18 at 15:20

2 Answers


The comments more or less answer the question already, but I was still struggling with timezones myself, so I tried all the combinations to get things clear. I think there are two main approaches, depending on whether your datetime objects are already in the desired timezone or in a different one; I describe both below. It's possible that I still missed or mixed up something.

Timestamps (datetime objects): in UTC. Desired display: in a specific timezone.

  • Set xaxis_date() to your desired display timezone (it defaults to rcParams['timezone'], which was UTC for me)
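
A minimal sketch of this first approach, assuming naive datetimes that represent UTC. The `Agg` backend and the zone name `America/Los_Angeles` (the canonical name for US/Pacific) are illustrative choices, not requirements:

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from dateutil import tz

# Naive datetimes; Matplotlib treats them as UTC when converting to numbers.
start = datetime.datetime(1970, 1, 1)
timestamps = [start + datetime.timedelta(hours=i) for i in range(24)]
values = list(range(24))

fig, ax = plt.subplots()
ax.scatter(timestamps, values)

# Ask the x axis to label its ticks in the desired display timezone.
ax.xaxis_date(tz.gettz("America/Los_Angeles"))

fig.canvas.draw()  # tick labels are only computed once the figure is drawn
labels = [t.get_text() for t in ax.get_xticklabels()]
print(labels)
```

Note that this only takes effect when the plotted datetimes are naive; if they carry a tzinfo, the axis units are already set by the data and the call appears to change nothing.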

Timestamps (datetime objects): in a specific timezone. Desired display: in a different specific timezone.

  • Feed your plot function datetime objects with the corresponding timezone (tzinfo=)
  • Set rcParams['timezone'] to your desired display timezone
  • Use a DateFormatter (even if you are satisfied with the default format, because the formatter is timezone-aware)
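
A minimal sketch of this second approach. The zone names and the `Agg` backend are assumptions chosen for illustration; a `DateFormatter` created without an explicit `tz` picks up `rcParams['timezone']` at construction time (passing `tz=` directly to the formatter is an alternative):

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
from dateutil import tz

# Timezone-aware source data (hypothetical: recorded in Pacific time).
source_tz = tz.gettz("America/Los_Angeles")
start = datetime.datetime(1970, 1, 1, tzinfo=source_tz)
timestamps = [start + datetime.timedelta(hours=i) for i in range(24)]
values = list(range(24))

# Desired display timezone (hypothetical choice).
matplotlib.rcParams["timezone"] = "Europe/Paris"

fig, ax = plt.subplots()
ax.scatter(timestamps, values)

# The formatter reads rcParams['timezone'] because no tz is passed here.
formatter = mdates.DateFormatter("%H:%M(%d)")
ax.xaxis.set_major_formatter(formatter)
fig.canvas.draw()

# Format the first data point the same way a tick at that position would be.
first_label = formatter(mdates.date2num(timestamps[0]))
print(first_label)
```

Midnight Pacific time on 1 January 1970 is 08:00 UTC, so the first point should be labelled with the Paris wall-clock time, one hour ahead of UTC on that date.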

If you are using plot_date() you can also pass in the tz keyword, but this is not possible for a scatter plot.

When your source data contains Unix timestamps, choose carefully between datetime.datetime.utcfromtimestamp() and the non-UTC fromtimestamp() if you are going to use Matplotlib's timezone capabilities.
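
The difference can be sketched with the standard library alone. This example uses `fromtimestamp(ts, tz=...)` rather than `utcfromtimestamp()`, since the latter is deprecated as of Python 3.12; the variable names are illustrative:

```python
import datetime

ts = 0  # the Unix epoch, 1970-01-01 00:00 UTC

# Naive datetime in the *system* timezone: the value depends on the machine.
local_naive = datetime.datetime.fromtimestamp(ts)

# Timezone-aware datetime in UTC: the same on every machine.
utc_aware = datetime.datetime.fromtimestamp(ts, tz=datetime.timezone.utc)

# Naive UTC datetime, the same value utcfromtimestamp(ts) would give.
utc_naive = utc_aware.replace(tzinfo=None)

print(local_naive, utc_aware, utc_naive)
```

Only the first value varies with the system timezone, which is why it is the risky one to hand to Matplotlib.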

This is the experiment I ran (on scatter() in this case). It may be a bit hard to follow, but I include it here for anyone who cares. Notice at what time the first dots appear (the x axis does not start at the same time in each subplot):

[Figure: different combinations of timezones]

Source code:

import time,datetime,matplotlib
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.dates as mdates
from dateutil import tz


#y
data = np.array([i for i in range(24)]) 

#create a datetime object from the unix timestamp 0 (epoch=0:00 1 jan 1970 UTC)
start = datetime.datetime.fromtimestamp(0)  
# it will be the local datetime (depending on your system timezone) 
# corresponding to the epoch
# and it will not have a timezone defined (standard python behaviour)

# if your data comes as unix timestamps and you are going to work with
# matplotlib timezone conversions, you had better use this function:
start = datetime.datetime.utcfromtimestamp(0)   

timestamps = np.array([start + datetime.timedelta(hours=i) for i in range(24)])
# now add a timezone to those timestamps, US/Pacific UTC -8, be aware this
# will not create the same set of times, they do not coincide
timestamps_tz = np.array([
    start.replace(tzinfo=tz.gettz('US/Pacific')) + datetime.timedelta(hours=i)
    for i in range(24)])


fig = plt.figure(figsize=(10.0, 15.0))


#now plot all variations
plt.subplot(711)
plt.scatter(timestamps, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().set_title("1 - tzinfo NO, xaxis_date = NO, formatter=NO")


plt.subplot(712)
plt.scatter(timestamps_tz, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().set_title("2 - tzinfo YES, xaxis_date = NO, formatter=NO")


plt.subplot(713)
plt.scatter(timestamps, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().xaxis_date('US/Pacific')
plt.gca().set_title("3 - tzinfo NO, xaxis_date = YES, formatter=NO")


plt.subplot(714)
plt.scatter(timestamps, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().xaxis_date('US/Pacific')
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M(%d)'))
plt.gca().set_title("4 - tzinfo NO, xaxis_date = YES, formatter=YES")


plt.subplot(715)
plt.scatter(timestamps_tz, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().xaxis_date('US/Pacific')
plt.gca().set_title("5 - tzinfo YES, xaxis_date = YES, formatter=NO")


plt.subplot(716)
plt.scatter(timestamps_tz, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().set_title("6 - tzinfo YES, xaxis_date = NO, formatter=YES")
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M(%d)'))


plt.subplot(717)
plt.scatter(timestamps_tz, data)
plt.gca().set_xlim([datetime.datetime(1970,1,1), datetime.datetime(1970,1,2,12)])
plt.gca().xaxis_date('US/Pacific')
plt.gca().set_title("7 - tzinfo YES, xaxis_date = YES, formatter=YES")
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H:%M(%d)'))

fig.tight_layout(pad=4)
plt.subplots_adjust(top=0.90)

plt.suptitle(
    'Matplotlib {} with rcParams["timezone"] = {}, system timezone {}'
    .format(matplotlib.__version__,matplotlib.rcParams["timezone"],time.tzname))

plt.show()
Jesse Aldridge
Sebastian
  • vote up for the DateFormatter reminder: must specify tz param – Robin Loxley Feb 14 '17 at 06:17
  • It looks like what matters is to set the `tzinfo` of the dates? – Eric O. Lebigot Aug 14 '17 at 10:56
  • For example 6 I'd suggest to provide the timezone to the formatter `from pytz import timezone; formatter = mdates.DateFormatter("%H:%M(%d)") ; formatter.set_tzinfo(timezone('US/Pacific')); plt.gca().xaxis.set_major_formatter(formatter)` – pseyfert Feb 12 '18 at 11:24

If, like me, you came to this question while trying to get a timezone-aware pandas DataFrame to plot correctly, @pseyfert's comment about using a formatter with a timezone is right on the money. Here is an example for DataFrame.plot, showing some points across the transition from EST to EDT:

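import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# pd.date_range replaces DatetimeIndex(start=..., periods=...), removed in pandas 1.0
df = pd.DataFrame(
    dict(y=np.random.normal(size=5)),
    index=pd.date_range(
        start='2018-03-11 01:30',
        freq='15min',
        periods=5,
        tz='US/Eastern'))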

Notice how the timezone abbreviation changes as we transition to daylight saving time:

> [f'{t:%T %Z}' for t in df.index]
['01:30:00 EST',
 '01:45:00 EST',
 '03:00:00 EDT',
 '03:15:00 EDT',
 '03:30:00 EDT']

Now, plot it:

df.plot(style='-o')
formatter = mdates.DateFormatter('%m/%d %T %Z', tz=df.index.tz)
plt.gca().xaxis.set_major_formatter(formatter)
plt.show()

[Figure: the resulting plot]


PS:

I am not sure why some of the dates (the EST ones) look as if they were in bold, but presumably Matplotlib's internals render the labels more than once and the position shifts by a pixel or two. The following confirms that the formatter is called several times for the same timestamps:

class Foo(mdates.DateFormatter):
    # Recent Matplotlib releases no longer provide DateFormatter.strftime,
    # so hook into __call__, which runs each time a tick label is formatted.
    def __call__(self, x, pos=0):
        s = super().__call__(x, pos)
        print(f'out={s} for x={x}, pos={pos}')
        return s

And check out the output of:

df.plot(style='-o')
formatter = Foo('%F %T %Z', tz=df.index.tz)
plt.gca().xaxis.set_major_formatter(formatter)
plt.show()
Pierre D