I wrote an Azure Function in Python to unzip a file and upload it back to blob storage. It works great for small files but fails when I try GB-sized files.
I know the issue is that the file is too large to fit in memory and needs to be streamed. Any suggestions on how to stream the unzip and upload of this file?
My code:
import logging
import os
from zipfile import ZipFile

import azure.functions as func
from azure.storage.blob import BlobServiceClient


def main(mytimer: func.TimerRequest) -> None:
    # Step 1. Get a client for the source blob
    source_conn_str = xxx
    source_container = xxx
    blob_service_client_origin = BlobServiceClient.from_connection_string(source_conn_str)
    source_fileName = xxx
    blob_to_copy = blob_service_client_origin.get_blob_client(container=source_container, blob=source_fileName)

    # Step 2. Download the zip file (this reads the whole blob into memory)
    os.chdir('/tmp/')
    print("Downloading file")
    blob_data = blob_to_copy.download_blob()
    data = blob_data.readall()
    print("Download complete")

    # Step 3. Save the zip file to the local tmp directory
    local_filepath = xxx
    with open(local_filepath, "wb") as file:
        file.write(data)

    # Step 4. Unzip the file into the local tmp directory
    with ZipFile(local_filepath, 'r') as zipObj:
        zipObj.extractall()

    # Step 5. Upload the extracted file to the destination storage account
    dest_conn_str = xxx
    blob_service_client = BlobServiceClient.from_connection_string(dest_conn_str)
    container_name = xxx
    # Set the local file name
    local_file_name = xxx
    blob_client = blob_service_client.get_blob_client(container=container_name, blob=local_file_name)

    # Upload the file to blob storage
    print('Uploading file')
    with open(local_file_name, "rb") as data:
        blob_client.upload_blob(data, overwrite=True)
    print('File Upload complete')
When I run the Azure Function it returns: Exception message: python exited with code 137, which means the process ran out of memory. Any suggestions are extremely appreciated.
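For reference, here is a minimal sketch of the direction I'm considering, based on my reading of the azure-storage-blob v12 API: spool the download to a temporary file in chunks with readinto() (since ZipFile needs a seekable source), then stream each archive member directly into upload_blob() instead of extracting to disk first. The function name unzip_streaming and its parameters are placeholders of my own, and I'm assuming the Function host has enough temp disk to hold the spooled archive.

import tempfile
from zipfile import ZipFile

from azure.storage.blob import BlobServiceClient


def unzip_streaming(source_conn_str, source_container, source_blob,
                    dest_conn_str, dest_container):
    source_client = BlobServiceClient.from_connection_string(
        source_conn_str).get_blob_client(
            container=source_container, blob=source_blob)
    dest_service = BlobServiceClient.from_connection_string(dest_conn_str)

    # Spool the zip to a temp file in chunks instead of readall(),
    # so the whole archive never sits in memory at once.
    with tempfile.TemporaryFile() as tmp:
        source_client.download_blob().readinto(tmp)
        tmp.seek(0)
        with ZipFile(tmp) as zip_obj:
            for name in zip_obj.namelist():
                if name.endswith("/"):  # skip directory entries
                    continue
                dest_client = dest_service.get_blob_client(
                    container=dest_container, blob=name)
                # zip_obj.open() yields a file-like stream; upload_blob
                # reads it in chunks rather than loading it all at once.
                with zip_obj.open(name) as member:
                    dest_client.upload_blob(member, overwrite=True)

Is something along these lines the right approach, or is there a way to avoid the temp file entirely?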