
I have this code to unzip a zip file that is encrypted with a password:

import zipfile

def main(pswd):
    file_name = 'somefile.zip'
    with zipfile.ZipFile(file_name) as file:
        return file.extractall(pwd = bytes(pswd, 'utf-8'))

print(main("password"))

It works, but I want the function to extract the archive and return, for example, "True" when I give it the correct password, and to return "False" when the password is wrong. How can I improve my code?

Kanexxy

2 Answers


The extractall function raises a RuntimeError when the password is wrong, so you can catch it and return the result you want:

import zipfile

def main(pswd):
    file_name = "somefile.zip"
    with zipfile.ZipFile(file_name) as file:
        try:
            file.extractall(pwd=bytes(pswd, "utf-8"))
            return True
        except RuntimeError:
            return False
kosciej16

TL;DR: If you want to unzip, implement proper exception handling. To test only the password, use ZipFile.testzip() with a preset default password.

Improvement advice

I would reconsider the decision to return a boolean when the password is incorrect.

Caution on boolean error-returns

Usually there is only one reason to return a boolean from a method:

  • the method is meant to test something, like a predicate function hasError() or isValid(input). See the further reading at the end.

Do it or fail

You want a boolean as the return value so you can react to a failure. That may point to a logical flaw in the design. When we call the method, we pass it a password and expect it to succeed, e.g. to unlock the zip file with the given password and extract it.

If it fails instead, we want to react to the exception. This is done with exception handling.
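As a minimal sketch of this "do it or fail" idea: the function below returns nothing and lets any exception propagate, and a separate caller decides how to react. The names extract_archive, report and dest_dir are hypothetical and not part of the question's code.

```python
import zipfile


def extract_archive(file_name, pswd, dest_dir="."):
    """Extract the archive, or fail with an exception (no boolean result)."""
    with zipfile.ZipFile(file_name) as archive:
        archive.extractall(path=dest_dir, pwd=bytes(pswd, "utf-8"))


def report(file_name, pswd, dest_dir="."):
    """Caller side: react to the failure where it actually matters."""
    try:
        extract_archive(file_name, pswd, dest_dir)
    except RuntimeError as e:
        # zipfile reports a wrong password as a RuntimeError
        print(f"Could not extract {file_name}: {e}")
    else:
        print(f"Extracted {file_name} to {dest_dir}")
```

The extracting function stays simple, and each caller can choose whether to retry, log, or abort.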

Exception handling for zipfile

According to the Python documentation (zipfile — Work with ZIP archives), extract() and extractall() can raise exceptions for many reasons, e.g. ValueError or RuntimeError. A bad password is only one of the possible causes, see Decompression Pitfalls:

Decompression may fail due to incorrect password / CRC checksum / ZIP format or unsupported compression method / decryption.

You can catch these exceptions with a try-except block:

import zipfile

def is_correct(pswd): # renamed from `main`
    file_name = 'somefile.zip'
    with zipfile.ZipFile(file_name) as file:
        try:
            file.extractall(pwd = bytes(pswd, 'utf-8'))
            return True  # correct password, decrypted and extracted successfully
        except RuntimeError as e:
            if e.args[0].startswith('Bad password for file'):
                return False  # incorrect password
            raise  # re-raise any other RuntimeError; TODO: properly handle exceptions?


print(is_correct("password"))
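Because a bad password is only one of several failure causes, it can help to tell them apart. The following is a sketch under assumptions, not a complete error-handling scheme: the function name extract_status and the status strings are made up for illustration. Note that NotImplementedError is a subclass of RuntimeError, so it has to be caught first.

```python
import zipfile


def extract_status(file_name, pswd, dest_dir="."):
    """Return a short status string describing how the extraction went."""
    try:
        with zipfile.ZipFile(file_name) as archive:
            archive.extractall(path=dest_dir, pwd=bytes(pswd, "utf-8"))
    except zipfile.BadZipFile:
        return "corrupt"  # not a ZIP file, or a CRC check failed
    except NotImplementedError:
        # must come before RuntimeError, which is its base class
        return "unsupported"  # e.g. unsupported compression method
    except RuntimeError as e:
        if str(e).startswith("Bad password for file"):
            return "bad password"
        raise  # unknown RuntimeError: do not hide it
    return "ok"
```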

Testing the password - without extraction

An answer on the Stack Exchange site Code Review, to the question "python - Zipfile password recovery program", suggests:

Use ZipFile.setpassword() with ZipFile.testzip() to check the password.

First use .setpassword(pwd) to set a default password for this zip file. Then .testzip() tests whether it can open and read every file in the archive, using this previously set default password:

Read all the files in the archive and check their CRC’s and file headers. Return the name of the first bad file, or else return None.

Again, a bad password is not the only reason for it to raise an exception or return a non-None value.
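To illustrate a non-None result that has nothing to do with passwords, here is a small sketch that damages an unencrypted in-memory archive and runs testzip() on it. The helper name first_bad_member is hypothetical, and the archive is unencrypted because the zipfile module cannot create encrypted files.

```python
import io
import zipfile


def first_bad_member(data, pswd=None):
    """Return the name of the first damaged member, or None if all are intact."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        if pswd is not None:
            archive.setpassword(bytes(pswd, "utf-8"))
        return archive.testzip()


# Build a small, uncompressed archive in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_STORED) as archive:
    archive.writestr("a.txt", b"hello world")
intact = buffer.getvalue()

# Change one byte of the stored content so the CRC no longer matches.
damaged = intact.replace(b"hello world", b"jello world")

print(first_bad_member(intact))   # None
print(first_bad_member(damaged))  # a.txt
```

So a password check based on testzip() should treat a returned file name as an integrity problem rather than as proof of a wrong password.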

Here too, a bad password will raise a RuntimeError, so we need exception-handling again:

import zipfile

def is_correct(pswd):
    file_name = 'somefile.zip'
    pwd = bytes(pswd, 'utf-8')
    # use a context manager so the file is closed, and avoid shadowing the built-in zip()
    with zipfile.ZipFile(file_name) as archive:
        archive.setpassword(pwd)
        try:
            archive.testzip()  # returns without raising only if the password was accepted
            return True  # ignore if bad file was found (integrity violated)
        except RuntimeError as e:
            if e.args[0].startswith('Bad password for file'):
                return False
            raise  # re-raise any other exception to print on console


passwords = ["password", "pass"]
for p in passwords:
    print(f"'{p}' is correct?", is_correct(p))

Prints:

'password' is correct? False
'pass' is correct? True

See also

Working with ZipFile and passwords:

Boolean from a clean-code perspective:

hc_dev