6

I have following code in python:

>>> import zipfile
>>> zip = zipfile.ZipFile('abc.zip', 'w')
>>> zip.writestr('myfile', 'This is sample text')
>>> zip.writestr('myfile', 'This is sample text')
>>> zip.close()

This creates an archive with two files having exactly same name and path.

Why is this?

enter image description here

Dharman
  • 30,962
  • 25
  • 85
  • 135
Kshitiz Sharma
  • 17,947
  • 26
  • 98
  • 169
  • 1
    How can you two files, with precisely the same name and path? – msvalkon Mar 03 '14 at 10:17
  • Looks like it creates two files with the same name and path in the archive when I try it, too. There's all the code necessary to reproduce it right in the question, so I don't understand the "lacks sufficient information to diagnose the problem" close vote. – user2357112 Mar 03 '14 at 10:21
  • The zip format allows multiple files with the same name. Though I can't seem to find the appropriate documentation for that feature. If anyone finds a link, I would like to check that information. – Depado Mar 03 '14 at 10:25
  • This is an somewhat annoying feature of the archive formats. For example, Tar allows that as well, the file names are actually not used to identify the files. That's why you cannot rely on the file name when reading from a zip archive and should rather pass `ZipInfo`. – bereal Mar 03 '14 at 10:57
  • In a similar vein, what is defined as the 'same' is somewhat OS / filesystem specific. Eg. you can have 'file.json' and 'File.json' that end up colliding on OS X with case-insensitive filesystem. Don't get me started on things like 'file.aux' on Windows... – Cameron Kerr Mar 25 '21 at 23:31

1 Answers1

5

This is allowed by Zip and some other archive formats, like Tar, and even addressed by the Python API:

Note: The open(), read() and extract() methods can take a filename or a ZipInfo object. You will appreciate this when trying to read a ZIP file that contains members with duplicate names.

bereal
  • 32,519
  • 6
  • 58
  • 104