While working with the zipfile module, I found something odd about how it handles timestamps.

I'm zipping a single file whose last-modified time is 13:40:31 (HH:MM:SS). When I zip and then unzip the file, its last-modified time is 13:40:30 (it loses 1 second).

While testing this, I used a ZipInfo object to set the last-modified time manually to 13:40:31, but I still get 13:40:30.

I also tried setting it to 13:40:41, and then I got 13:40:40.

With other values for the seconds it works fine; for example, if I set it to 13:40:32, the time is correct after unzipping the file.

Any clue about this? Am I missing something?

OS: Windows 10 (64-bit)
Python: 3.7

Test: compress any file, then unzip it and compare the last-modified time.

import zipfile

file = 'testfile.txt'

# Entry metadata with an explicit odd-second timestamp (13:40:31)
info = zipfile.ZipInfo(file,
    date_time=(2020, 9, 23, 13, 40, 31))

# The context managers close both the archive and the input file
with zipfile.ZipFile(file='test.zip', mode='w', compression=zipfile.ZIP_DEFLATED) as zf:
    with open(file, 'r') as f:
        zf.writestr(info, f.read(), zipfile.ZIP_DEFLATED, 6)
Dharman
webbi

1 Answer

[EDIT: Updated to document Linux & Windows behaviour]

Legacy Behaviour

By default, zip files store timestamps with 2-second precision. This dates way back to the time when DOS ruled the world and every bit counted. Below is the definition of how it works from the Zip spec (APPNOTE.TXT):

4.4.6 date and time fields: (2 bytes each)   The date and time are encoded in standard MS-DOS format. If input came from standard input, the date and time are those at which compression was started for this data. If encrypting the central directory and general purpose bit flag 13 is set indicating masking, the value stored in the Local Header will be zero. MS-DOS time format is different from more commonly used computer time formats such as UTC. For example, MS-DOS uses year values relative to 1980 and 2 second precision.
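To make the 2-second precision concrete, here is a minimal sketch of the MS-DOS time packing: 5 bits for the hour, 6 bits for the minute, and only 5 bits for the seconds, so the seconds are stored halved. The helper names `dos_time_encode` and `dos_time_decode` are made up for this illustration and are not part of zipfile.

```python
def dos_time_encode(hour, minute, second):
    """Pack a time into 16 bits: 5 bits hour, 6 bits minute, 5 bits second // 2."""
    return (hour << 11) | (minute << 5) | (second // 2)


def dos_time_decode(value):
    """Unpack a 16-bit MS-DOS time; the stored seconds value is doubled."""
    return (value >> 11) & 0x1F, (value >> 5) & 0x3F, (value & 0x1F) * 2


# The values from the question: odd seconds are rounded down to even ones.
for second in (30, 31, 32, 41):
    packed = dos_time_encode(13, 40, second)
    print(second, "->", dos_time_decode(packed))
```

Five bits can only hold 0 to 31, which is why the format stores seconds divided by two and an odd value such as 31 or 41 comes back as 30 or 40.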

Although the legacy 2-second precision timestamp is still present in all zip files, most modern zip implementations also use one (or more) extended attributes to store the timestamp with (at least) one-second precision. These extended attributes take priority over the legacy timestamp in applications that support them.

Looks like Python doesn't currently support these extended attributes. See Issue 49707 for the details.

A well-known exception to this better datetime support is the Windows right-click/Send-To/Compressed-folder feature, which still supports only the old legacy 2-second granularity.

Linux/MacOS

In Linux (and some Windows) zip applications, the predominant datetime extension is the Extended Timestamp Extra Field. It stores one or more of the modification, access and creation times in standard Unix/Linux format, namely the number of seconds elapsed since 1 January 1970 00:00:00 UTC. See "Extended Timestamp Extra Field" in extrafld.txt for the full details.
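As a rough sketch of what this field looks like on disk, the code below attaches an Extended Timestamp record (header ID 0x5455, a flags byte, then a 4-byte Unix time) to a `ZipInfo` through its `extra` attribute and reads it back. The helpers `build_extended_timestamp` and `read_extended_mtime` are hypothetical names written for this example. It uses UTC for both timestamps to keep things simple, whereas real tools usually write the legacy field in local time.

```python
import io
import struct
import zipfile
from datetime import datetime, timezone

EXTENDED_TIMESTAMP_ID = 0x5455  # "UT" header ID, as described in extrafld.txt


def build_extended_timestamp(mtime):
    """Build an extra-field record that carries only the modification time."""
    flags = 0x01  # bit 0 set: modification time is present
    payload = struct.pack("<Bi", flags, int(mtime))
    return struct.pack("<HH", EXTENDED_TIMESTAMP_ID, len(payload)) + payload


def read_extended_mtime(extra):
    """Walk the (id, size, data) records in `extra`; return the mtime or None."""
    offset = 0
    while offset + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, offset)
        data = extra[offset + 4 : offset + 4 + size]
        if header_id == EXTENDED_TIMESTAMP_ID and len(data) >= 5 and data[0] & 0x01:
            return struct.unpack_from("<i", data, 1)[0]
        offset += 4 + size
    return None


when = datetime(2020, 9, 23, 13, 40, 31, tzinfo=timezone.utc)
info = zipfile.ZipInfo("testfile.txt", date_time=when.timetuple()[:6])
info.extra = build_extended_timestamp(when.timestamp())

# Write to an in-memory archive, then read the entry back.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(info, b"hello")

with zipfile.ZipFile(buffer) as zf:
    stored = zf.getinfo("testfile.txt")
    legacy = stored.date_time                      # legacy DOS field
    precise = read_extended_mtime(stored.extra)    # Unix time from the extra field

print(legacy, datetime.fromtimestamp(precise, tz=timezone.utc))
```

As far as I know, zipfile only carries the `extra` bytes through and does not interpret this record itself, so whether the precise time is restored depends on the tool that unzips the archive.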

Windows

Some Windows zip implementations use the "NTFS" attributes extension to store the modification, creation and access times as 64-bit values. The definition of the 64-bit value is shown below (taken from §4.5.5, "NTFS Extra Field (0x000a)", in APPNOTE.TXT):

They determine the number of 1.0E-07 seconds (1/10th microseconds!) past WinNT "epoch", which is "01-Jan-1601 00:00:00 UTC".
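The arithmetic behind that definition can be sketched as follows: a 64-bit count of 100-nanosecond ticks since 1 January 1601 UTC, converted to and from a Python `datetime`. The function names are made up for this example, and sub-microsecond ticks are dropped because `datetime` cannot represent them.

```python
from datetime import datetime, timedelta, timezone

NT_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
TICKS_PER_SECOND = 10_000_000  # one tick is 1.0E-07 seconds


def filetime_to_datetime(ticks):
    """Convert a 64-bit NT timestamp to a timezone-aware UTC datetime."""
    return NT_EPOCH + timedelta(microseconds=ticks // 10)


def datetime_to_filetime(moment):
    """Convert a timezone-aware datetime to NT ticks (integer arithmetic only)."""
    delta = moment - NT_EPOCH
    whole_seconds = delta.days * 86_400 + delta.seconds
    return whole_seconds * TICKS_PER_SECOND + delta.microseconds * 10


when = datetime(2020, 9, 23, 13, 40, 31, tzinfo=timezone.utc)
ticks = datetime_to_filetime(when)
print(ticks, filetime_to_datetime(ticks))
```

Unlike the legacy DOS field, this representation keeps the odd second from the question (and far finer detail) intact.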

One final point - it is valid for zip files to have both the Linux & Windows extended timestamp extensions at the same time. The unzipping application decides which to use based on the OS it is running on.

pmqs
  • Thanks for your reply, I didn't know about the 2-second precision. I should look for another library to zip files, then. – webbi Sep 24 '20 at 18:31
  • Just found some more info: it was reported as a bug, but it appears to have been ignored because zipfile follows the ZIP standard: https://bugs.python.org/issue5457 – webbi Sep 24 '20 at 19:24
  • That ticket only discusses the legacy date/time field with its 2-second precision. The extensions that support more accurate times in zip files are also part of the zip standard. Most modern command-line implementations support the extensions for both reading and writing zip files. – pmqs Sep 24 '20 at 20:10