
I am trying to unzip a password-protected file that is stored in an Azure Blob container, and I want to extract it into Azure Blob storage itself. To test this, I created an Azure Function App in Python (currently triggered by a timer).

My code is below. I am not sure what the correct way to achieve this is.

import datetime
import os, uuid
import azure.functions as func
import azure.storage.blob
from zipfile import ZipFile
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient

def test_func():
    #get connection string to storage account
    connect_str = os.getenv('AZURE_STORAGE_CONNECTION_STRING')

    # BlobServiceClient object, used to obtain a container client
    blob_service_client = BlobServiceClient.from_connection_string(connect_str)

    # Name of the existing container that holds the zip file
    container_name = "zipextract"
    container_client =  blob_service_client.get_container_client(container_name)

    blob_list = container_client.list_blobs()

    for blob in blob_list:
        print("----> " + blob.name)
        with ZipFile(blob.name) as zf:
            zf.extractall(pwd=b'password')   

When I try to open the file with ZipFile(), it fails with "No such file or directory: 'TestZip.zip'".

The full error message follows (TestZip.zip is the zip file placed in the zipextract container):

Result: Failure Exception: FileNotFoundError: [Errno 2] No such file or directory: 'TestZip.zip'
Stack:
  File "/azure-functions-host/workers/python/3.8/LINUX/X64/azure_functions_worker/dispatcher.py", line 271, in _handle__function_load_request
    func = loader.load_function(
  File "/azure-functions-host/workers/python/3.8/LINUX/X64/azure_functions_worker/utils/wrappers.py", line 32, in call
    return func(*args, **kwargs)
  File "/azure-functions-host/workers/python/3.8/LINUX/X64/azure_functions_worker/loader.py", line 76, in load_function
    mod = importlib.import_module(fullmodname)
  File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 961, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/site/wwwroot/TimerTrigger1/__init__.py", line 39, in <module>
    test_func()
  File "/home/site/wwwroot/TimerTrigger1/__init__.py", line 33, in test_func
    with ZipFile(blob.name) as zf:
  File "/usr/local/lib/python3.8/zipfile.py", line 1251, in __init__
    self.fp = io.open(file, filemode)

How can I unzip this file? Unzipping works fine on my local machine, but I am not sure how to make it read from the blob instead of a local file.

Thank you.

Rameshwar Pawale

2 Answers


The zip files are stored in Azure Blob Storage, which is a remote service. ZipFile(blob.name) cannot open them, because a string argument is treated as a path on the local file system. You need to read the zip file's content first and then unzip it from memory.

For example:

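import io
import os
import zipfile

from azure.storage.blob import BlobServiceClient

# Connection string read from the same app setting the question uses
conn_str = os.getenv('AZURE_STORAGE_CONNECTION_STRING')
blob_service_client = BlobServiceClient.from_connection_string(conn_str)
container_client = blob_service_client.get_container_client('input')
blob_client = container_client.get_blob_client('sampleData.zip')
des_container_client = blob_service_client.get_container_client('output')

with io.BytesIO() as b:
    # Download the whole archive into an in-memory buffer
    blob_client.download_blob().readinto(b)
    with zipfile.ZipFile(b) as z:
        for filename in z.namelist():
            # Skip directory entries; only files are uploaded
            if not filename.endswith('/'):
                print(filename)
                # pwd is the archive password as bytes
                with z.open(filename, mode='r', pwd=b'password') as f:
                    # Read the decrypted bytes, then upload them as a blob
                    des_container_client.get_blob_client(
                        filename).upload_blob(f.read())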
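
The extraction step can also be separated from the storage calls so that it can be tested without a storage account. The following is a minimal sketch using only the standard library; `iter_zip_members` is a hypothetical helper name, not part of any SDK. Note that Python's `zipfile` module only decrypts the legacy zip encryption scheme, so an AES-encrypted archive would likely need a third-party library instead.

```python
import io
import zipfile
from typing import Iterator, Optional, Tuple


def iter_zip_members(
    data: bytes, pwd: Optional[bytes] = None
) -> Iterator[Tuple[str, bytes]]:
    """Yield (member_name, content) for every file in an in-memory zip archive.

    `data` is the raw bytes of the archive; `pwd` is the password as bytes,
    or None for an unencrypted archive.
    """
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            # Directory entries carry no content, so skip them
            if info.is_dir():
                continue
            yield info.filename, archive.read(info, pwd=pwd)


if __name__ == "__main__":
    # Build a small unencrypted archive in memory to exercise the helper
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("docs/readme.txt", "hello")
        zf.writestr("data.csv", "a,b\n1,2\n")
    for name, content in iter_zip_members(buf.getvalue()):
        print(name, len(content))
```

In the function app, `data` would come from `blob_client.download_blob().readall()`, and each yielded `(name, content)` pair would be passed to `get_blob_client(name).upload_blob(content)` on the destination container. This holds the whole archive and each member in memory, so it suits archives that fit comfortably in the function's memory limit.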
Jim Xu

ZipFile(blob.name) looks for the file in the local file system, where it does not exist.

You need to download the blob to the local file system first.

# Download blob to local file
local_file_path = os.path.join(".", blob.name)
with open(local_file_path, "wb") as download_file:
    blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob.name)
    download_file.write(blob_client.download_blob().readall())

# read local zip file
with ZipFile(local_file_path) as zf:
    zf.extractall(pwd=b'password')
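
If the local-file route is used inside a function app, a temporary directory is usually a safer place to write than the current directory, which may not be writable depending on how the app is deployed (that is an assumption about the hosting setup, not something stated in the question). A minimal standard-library sketch, with `extract_via_temp_dir` as a hypothetical helper name:

```python
import os
import tempfile
import zipfile
from typing import Dict, Optional


def extract_via_temp_dir(data: bytes, pwd: Optional[bytes] = None) -> Dict[str, bytes]:
    """Write the archive to a temp dir, extract it there, and return
    {relative_path: content} for every extracted file."""
    extracted: Dict[str, bytes] = {}
    # The directory and everything in it is deleted when the block exits
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = os.path.join(tmp, "archive.zip")
        with open(zip_path, "wb") as fh:
            fh.write(data)

        out_dir = os.path.join(tmp, "out")
        with zipfile.ZipFile(zip_path) as zf:
            zf.extractall(path=out_dir, pwd=pwd)

        # Collect the extracted files, using forward slashes as blob-style names
        for root, _dirs, files in os.walk(out_dir):
            for name in files:
                full = os.path.join(root, name)
                rel = os.path.relpath(full, out_dir).replace(os.sep, "/")
                with open(full, "rb") as fh:
                    extracted[rel] = fh.read()
    return extracted
```

Here `data` would be the bytes returned by `blob_client.download_blob().readall()`, and each entry of the returned dictionary could then be uploaded back to a container with `upload_blob`.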

Kashyap
  • I do not want to download it to the local system. Can it not be done in blob storage itself? – Rameshwar Pawale Jan 08 '21 at 05:35
  • @RameshwarPawale not using ZipFile interface, no. Only possibility would be to read the zip file into memory as bytes and then unzip it (in memory) and write the unzipped contents to ADLS, if you don't want-to/cannot create a local file. There is no library that I know of that reads zip file from ADLS, unzips it and writes unzipped contents to ADLS. – Kashyap Jan 08 '21 at 16:21