I have been tasked with converting various .txt files to .Z compressed files. We use Python 3.11 for our automation. I used the following code to create a simple .Z file using zlib. It creates the file, and I can read it back from Python. However, when I send the finished .Z file to someone who uses an application like WinZip or 7-Zip to decompress and extract the data, the app generally responds with "cannot open the archive". Is there something I am missing when writing the compressed data to a shareable .Z file?

Contents of 'test_file.txt': blah blah blah

import zlib

# file to compress
filename_in = r'C:\Users\JC\Documents\Test8\test_file.txt'

# output file
filename_out = r'C:\Users\JC\Documents\Test_8\test_output.Z'

with open(filename_in, mode="r") as fin, open(filename_out, mode="wb") as fout:
    data = fin.read()
    # zlib.compress returns a bare zlib stream: compressed bytes only, with no file name or archive entries
    compressed_data = zlib.compress(data.encode('utf-8'), zlib.Z_BEST_COMPRESSION)

    fout.write(compressed_data)
  • You're compressing the data itself of the text file, but you're not compressing the file itself. I would think that that would mean that the compressed data is just data and contains no info about file systems or anything - what do you expect to see when you open the archive, when you haven't actually put any files into it? What would you expect to be extracted from it? This code would work if you were just directly reading and decoding the data right back into a string from within a script, but if you expect a file to be contained in the archive, you have to compress a file, not just data – Random Davis Dec 06 '22 at 18:01
  • When the archive is opened, people will be expecting to see the original txt file. I see your point. Do you know what the Python steps would be to compress the text file into a .Z file? Thank you. – user20705383 Dec 06 '22 at 18:09
  • Yeah the issue is that there's no concept of a "file" that you've compressed. There's no file name. You've just compressed some *data* that *happened* to be in a file. Anyway, I'd think this would be like `.gz` files - you just first contain everything you want to compress in a `.tar` file, then compress that. If you decompress it as data directly, you will then have access to the file system info that's contained in the `.tar` file, I would guess. – Random Davis Dec 06 '22 at 18:17
  • Who gave you this task? What exactly are they expecting that ".Z" means? The original meaning of ".Z" is a file that is compressed by the Unix `compress` command. zlib cannot produce that. – Mark Adler Dec 07 '22 at 04:22

1 Answer

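Your code is fine, but as Mark noted in his comment, the compression library you've used (zlib) does not match the file extension (.Z). zlib writes a zlib (DEFLATE) stream, whereas .Z denotes the LZW-based format written by the Unix `compress` command.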
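One way to see the mismatch is to look at the first bytes of the output. The sketch below is only an illustration: `sniff_format` is a hypothetical helper, not part of any library, and it checks just a few well-known signatures (0x1F 0x9D for `compress`, 0x1F 0x8B for gzip, "PK" for zip, plus the two-byte zlib header check).

```python
import zlib


def sniff_format(header: bytes) -> str:
    """Guess a format from the leading bytes of a file (hypothetical helper)."""
    if header.startswith(b"\x1f\x9d"):
        return "compress (.Z)"
    if header.startswith(b"\x1f\x8b"):
        return "gzip (.gz)"
    if header.startswith(b"PK"):
        return "zip (.zip)"
    # A zlib header is two bytes: the low nibble of the first byte is 8 (deflate),
    # and the two bytes read as a big-endian integer are a multiple of 31.
    if (
        len(header) >= 2
        and (header[0] & 0x0F) == 8
        and ((header[0] << 8) | header[1]) % 31 == 0
    ):
        return "zlib stream"
    return "unknown"


# Same call as in the question, applied to the sample file contents.
sample = zlib.compress("blah blah blah".encode("utf-8"), zlib.Z_BEST_COMPRESSION)
print(sample[:2].hex(), "->", sniff_format(sample))
```

A file written by the question's code would be reported as a zlib stream, which is why tools that expect a `compress` signature behind a .Z extension reject it.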

I doubt anyone really needs to perform "Lempel-Ziv-Welch 84" (.Z) compression in Python. That algorithm is now very old and has long been superseded. But if you really must, the approach I would recommend is to simply call the Linux `compress` command. If you are only running MS Windows, you could install Windows Subsystem for Linux (WSL). Since the patents for LZW84 have now lapsed, the `compress` command should be available for any Linux distribution (it is often packaged as ncompress).
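As a rough sketch of that approach, the function below pipes a file through an external `compress` executable. It assumes `compress` is installed and on the PATH of the environment where Python runs (for example inside WSL); `compress_to_z` is a hypothetical name, and the handling of exit status 2 follows the `compress` man page, which documents it as "output larger than input" rather than an error.

```python
import shutil
import subprocess


def compress_to_z(path_in: str, path_out: str) -> None:
    """Write an LZW-compressed copy of path_in by calling the external compress tool."""
    exe = shutil.which("compress")
    if exe is None:
        raise FileNotFoundError("the 'compress' command was not found on PATH")

    # With no file arguments, compress reads stdin and writes the .Z stream to stdout.
    with open(path_in, "rb") as fin, open(path_out, "wb") as fout:
        result = subprocess.run([exe], stdin=fin, stdout=fout, stderr=subprocess.PIPE)

    # Assumption: status 2 only means the data did not shrink, so it is not treated as fatal.
    if result.returncode not in (0, 2):
        raise RuntimeError(result.stderr.decode(errors="replace"))
```

Usage would look like `compress_to_z("test_file.txt", "test_file.txt.Z")`. Note that a .Z file holds a single compressed stream, so the recipient's tool names the extracted file after the archive name.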

Given that the people receiving the compressed files are trying to open them with WinZip or 7-Zip, your best bet is to compress them using Python's bundled ZipFile class.

Here is your example code, modified and simplified to use Python 3's ZipFile to create a zip-format archive:

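import os
from zipfile import ZipFile, ZIP_DEFLATED

# file to compress
filename_in = r'C:\Users\JC\Documents\Test8\test_file.txt'

# output file
filename_out = r'C:\Users\JC\Documents\Test_8\test_output.zip'

# ZIP_DEFLATED turns compression on; the default (ZIP_STORED) only stores the file uncompressed
with ZipFile(filename_out, "w", compression=ZIP_DEFLATED) as new_zip:
    # arcname stores just the file name in the archive instead of the full folder path
    new_zip.write(filename_in, arcname=os.path.basename(filename_in))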
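Before sending an archive, it can help to read it back and confirm what a recipient will see. The following is a minimal round-trip sketch, assuming a temporary directory stands in for the real folders; `zip_single_file` is a hypothetical helper, and the `compression=ZIP_DEFLATED` and `arcname` arguments are choices made for this illustration.

```python
import os
import tempfile
from zipfile import ZipFile, ZIP_DEFLATED


def zip_single_file(path_in: str, path_out: str) -> None:
    """Create a zip archive containing path_in under its bare file name."""
    with ZipFile(path_out, "w", compression=ZIP_DEFLATED) as zf:
        zf.write(path_in, arcname=os.path.basename(path_in))


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "test_file.txt")
    dst = os.path.join(tmp, "test_output.zip")
    with open(src, "w", encoding="utf-8") as f:
        f.write("blah blah blah")

    zip_single_file(src, dst)

    # Read the archive back the way an extraction tool would.
    with ZipFile(dst) as zf:
        names = zf.namelist()
        text = zf.read("test_file.txt").decode("utf-8")

print(names, text)
```

Unlike the bare zlib stream, the archive records an entry name, so tools such as WinZip or 7-Zip have a file to extract.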
gb96