
I have some Python 2.3.4 scripts to migrate to Python 2.7.5, and I found a strange difference in the behavior of strptime.

The sample script converts a string in (week number, day of week, year) format to a date:

import time

dw='51 0 18' # week number 51, 0 for Sunday, 18 for year 2018
date=time.strptime(dw,"%U %w %y")
print(date)

The output in Python 2.3.4:

(2018, 12, 16, 0, 0, 0, 6, 350, -1) # 2018 12 16

The output in Python 2.7.5:

time.struct_time(tm_year=2018, tm_mon=12, tm_mday=23, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=6, tm_yday=357, tm_isdst=-1) # 2018 12 23

The two versions disagree starting from the second week of the year (dw='2 0 18').

Is it a known issue of strptime or have I missed something?

BoarGules
adel

1 Answer


There are two common week numbering systems, and strptime() has a week-number directive for each: %W (weeks begin on Monday, as in ISO week numbering) and %U (weeks begin on Sunday, as in the week number system used in North America).
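As a quick illustration of how the two directives differ, here is a minimal sketch that asks strptime() for the Sunday of week 1 of 2019 under each one. The helper name sunday_of_week is just something I made up for this example.

```python
import time

def sunday_of_week(year, week, directive):
    """Parse 'year week 0' (0 = Sunday) with %U or %W; return (year, month, day)."""
    fmt = "%Y {0} %w".format(directive)
    t = time.strptime("{0} {1} 0".format(year, week), fmt)
    return (t.tm_year, t.tm_mon, t.tm_mday)

# 1 January 2019 was a Tuesday.
sunday_u = sunday_of_week(2019, 1, "%U")  # week 1 starts on the first Sunday
sunday_w = sunday_of_week(2019, 1, "%W")  # week 1 starts on the first Monday
print(sunday_u)  # (2019, 1, 6)
print(sunday_w)  # (2019, 1, 13)
```

With %W the Sunday is the last day of the week, so it lands a full week later than with %U.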

I'm not very familiar with the latter system, but I reckoned that Excel 2016 is, and when I checked I found that it agrees that in that system the Sunday in week 51 2018 is 16 December.

=WEEKNUM(DATE(2018,12,16))     --> 51

Wikipedia gives the method for determining week 1 in the North American system as follows: Week 1 begins on a Sunday and contains both 1 January and the first Saturday. Or, put another way, week 1 ends on the first Saturday in January.

So, up to 6 days of week 1 can actually fall in the previous year, and those days also count as being in week 53 of the previous year.
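To make that rule concrete, here is a small sketch of it using only the datetime module. The function name week_start is hypothetical, and the code assumes nothing beyond the rule as quoted above: week 1 is the Sunday-to-Saturday week that contains 1 January.

```python
import datetime

def week_start(year, week):
    """Sunday that begins `week` of `year`, where week 1 contains 1 January."""
    jan1 = datetime.date(year, 1, 1)
    # date.weekday() is Monday=0 ... Sunday=6, so this is the number of
    # days from the Sunday on or before 1 January up to 1 January itself.
    days_since_sunday = (jan1.weekday() + 1) % 7
    week1_start = jan1 - datetime.timedelta(days=days_since_sunday)
    return week1_start + datetime.timedelta(weeks=week - 1)

print(week_start(2018, 1))   # 2017-12-31
print(week_start(2018, 51))  # 2018-12-16
```

Under this rule week 1 of 2018 starts on the last day of 2017, and the Sunday of week 51 is 16 December, which matches the Excel result.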

This table shows, for 7 years, the Sunday on which week 1 begins under that rule (which can fall in the previous December) and, to its right, the following Sunday. So column 2 is 7 days after column 1. Excel 2016's WEEKNUM() function reports all the dates in column 2 as week 2:

Sunday falls on  Week 2 begins
---------------  -------------
26-Dec-2021      02-Jan-2022
27-Dec-2015      03-Jan-2016
28-Dec-2025      04-Jan-2026
29-Dec-2019      05-Jan-2020
30-Dec-2018      06-Jan-2019
31-Dec-2017      07-Jan-2018
01-Jan-2017      08-Jan-2017

If I ask Python 2.7 or 3.7 for the Sunday in week 1 of these years, like this:

import time

for year in (2022, 2016, 2026, 2020, 2019, 2018, 2017):
    print(time.strftime("%d-%b-%Y", time.strptime("{year} 1 0".format(year=year), "%Y %U %w")))

I get

02-Jan-2022
03-Jan-2016
04-Jan-2026
05-Jan-2020
06-Jan-2019
07-Jan-2018
01-Jan-2017

So, for the Python standard library's %U directive, week 1 begins on the first Sunday in January, rather than ending on the first Saturday. That is a reasonable approach, just a different one. The difference means that %U week numbers agree with Excel week numbers only in years where 1 January is a Sunday. In all other years, including 2018 as you report, %U will give a week number that is one less.
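If you need dates that follow the Excel-style numbering, one possible workaround is to let strptime() do the parsing and then shift the result back a week whenever 1 January is not a Sunday. This is only a sketch: the name parse_week_string is hypothetical, it assumes the input uses the same "%U %w %y" layout as in the question, and it does not try to handle week 0, which does not exist in the Excel-style system.

```python
import datetime
import time

def parse_week_string(dw):
    """Parse 'week weekday yy' where week 1 is the week containing 1 January."""
    t = time.strptime(dw, "%U %w %y")
    d = datetime.date(t.tm_year, t.tm_mon, t.tm_mday)
    jan1 = datetime.date(t.tm_year, 1, 1)
    # %U counts from the first Sunday of the year. Unless 1 January is
    # itself a Sunday (weekday() == 6), that is one week later than the
    # week containing 1 January, so move the result back by 7 days.
    if jan1.weekday() != 6:
        d -= datetime.timedelta(days=7)
    return d

print(parse_week_string("51 0 18"))  # 2018-12-16
print(parse_week_string("1 0 17"))   # 2017-01-01
```

For the string in the question this gives 16 December 2018, the same date that Excel reports for week 51.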

I reported this as a bug on bugs.python.org, issue 35535. The consensus there appears to be that the current behaviour is consistent with the documentation. That Python 2.3 agrees with Excel and Wikipedia, and Python 2.7 does not, seems to be regarded as unpersuasive.

So, if it wasn't before, it is a known issue now.

BoarGules
  • The day is different for the OP. – Tomalak Dec 18 '18 at 08:41
  • Thank you for the detailed and clear explanation of the bug. In addition to finding a workaround to get my code working as before (Python 2.3), I am interested in understanding the reference to glibc made in the answers to issue 35535. – adel Jan 02 '19 at 12:49
  • That reference by Paul Ganssle to `glibc` is just wrong. He was talking about `strftime()`, and Python's `strftime()` does indeed depend on the underlying implementation of `strftime()` in the C library. But your question was about `strptime()` and, as the Python docs make clear: "`strptime()` is independent of any platform" (https://docs.python.org/3/library/time.html#module-time). – BoarGules Jan 02 '19 at 13:07