I have some files inside a container named data:

folder1/somepath/folder2/output/folder3/my_file1.csv
folder1/somepath/folder2/output/folder3/my_file4.csv
folder1/somepath/folder2/output/folder3/my_file23.csv

I have the following code:

import os
from azure.identity import ManagedIdentityCredential
from azure.storage.blob import BlobServiceClient

file_names_prefix = os.path.join('folder1/somepath/','folder2','output','folder3','my_file')
client = BlobServiceClient('https://mystoragename.blob.core.windows.net',credential=ManagedIdentityCredential()).get_container_client('data')
blob_list = client.list_blobs(name_starts_with=file_names_prefix)
file_list = [blob.name for blob in blob_list]

The code above produces the following output:

['folder1/somepath/folder2/output/folder3/my_file1.csv',
 'folder1/somepath/folder2/output/folder3/my_file4.csv',
 'folder1/somepath/folder2/output/folder3/my_file23.csv']

But when I try to delete these files using:

client.delete_blobs(file_list)

I get an error:

TypeError                                 Traceback (most recent call last)
/tmp/ipykernel_2376/712121654.py in <module>
----> 1 client.delete_blobs(file_list)

/anaconda/envs/azureml_py38/lib/python3.8/site-packages/azure/core/tracing/decorator.py in wrapper_use_tracer(*args, **kwargs)
     81             span_impl_type = settings.tracing_implementation()
     82             if span_impl_type is None:
---> 83                 return func(*args, **kwargs)
     84
     85             # Merge span is parameter is set, but only if no explicit parent are passed

/anaconda/envs/azureml_py38/lib/python3.8/site-packages/azure/storage/blob/_container_client.py in delete_blobs(self, *blobs, **kwargs)
   1298             return iter(list())
   1299
-> 1300         reqs, options = self._generate_delete_blobs_options(*blobs, **kwargs)
   1301
   1302         return self._batch_send(*reqs, **options)

/anaconda/envs/azureml_py38/lib/python3.8/site-packages/azure/storage/blob/_container_client.py in _generate_delete_blobs_options(self, *blobs, **kwargs)
   1206             req = HttpRequest(
   1207                 "DELETE",
-> 1208                 "/{}/{}{}".format(quote(container_name), quote(blob_name, safe='/~'), self._query_str),
   1209                 headers=header_parameters
   1210             )

/anaconda/envs/azureml_py38/lib/python3.8/urllib/parse.py in quote(string, safe, encoding, errors)
    817         if errors is not None:
    818             raise TypeError("quote() doesn't support 'errors' for bytes")
--> 819     return quote_from_bytes(string, safe)
    820
    821 def quote_plus(string, safe='', encoding=None, errors=None):

/anaconda/envs/azureml_py38/lib/python3.8/urllib/parse.py in quote_from_bytes(bs, safe)
    842     """
    843     if not isinstance(bs, (bytes, bytearray)):
--> 844         raise TypeError("quote_from_bytes() expected bytes")
    845     if not bs:
    846         return ''

TypeError: quote_from_bytes() expected bytes

Can someone please help?

Obiii

3 Answers

I tried various things, but nothing worked, so I ended up deleting the files one at a time in a loop.

for file in file_list:
    client.delete_blob(file)
Obiii

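See https://github.com/Azure/azure-sdk-for-python/issues/25764. delete_blobs is declared with *blobs as its first parameter, so it expects each blob as a separate positional argument rather than a single list. Unpacking the list with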

client.delete_blobs(*file_list)

should do the trick.
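
Why passing the list itself fails can be reproduced without Azure at all. The sketch below uses a hypothetical stand-in, build_delete_paths, that mimics only the URL-quoting step visible in the traceback (it is not the SDK's code). With a single list argument, *blobs becomes a one-element tuple holding the whole list, and urllib.parse.quote rejects that list:

```python
from urllib.parse import quote


def build_delete_paths(container_name, *blobs):
    """Hypothetical stand-in for the quoting step shown in the traceback."""
    # Each positional argument is treated as one blob name.
    return [
        "/{}/{}".format(quote(container_name), quote(blob_name, safe="/~"))
        for blob_name in blobs
    ]


file_list = [
    "folder1/somepath/folder2/output/folder3/my_file1.csv",
    "folder1/somepath/folder2/output/folder3/my_file4.csv",
]

# Passing the list itself: blobs == (file_list,), so quote() receives a list.
try:
    build_delete_paths("data", file_list)
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# Unpacking the list: blobs == ("folder1/...", "folder1/..."), one str each.
paths = build_delete_paths("data", *file_list)

print(list_error)
print(paths)
```

The same reasoning suggests that client.delete_blobs(*file_list) gives the SDK one string per blob, which is what its quoting step appears to expect.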

Here are the official docs for reference.
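
One caveat when unpacking long lists: the documentation for delete_blobs states a limit of 256 blobs per request. Below is a minimal sketch of splitting the names into batches, assuming that limit applies to your SDK version. chunked and delete_in_batches are hypothetical helper names, and container_client is assumed to be a ContainerClient like the one in the question.

```python
from typing import Iterator, List, Sequence

# Assumed per-request limit; verify it against the docs for your SDK version.
MAX_BATCH = 256


def chunked(names: Sequence[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of `names`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(names), size):
        yield list(names[start:start + size])


def delete_in_batches(container_client, names: Sequence[str]) -> None:
    """Call delete_blobs once per batch, unpacking each batch into *blobs."""
    for batch in chunked(names):
        container_client.delete_blobs(*batch)


if __name__ == "__main__":
    # Show the batch sizes for 600 made-up blob names (no Azure call is made).
    demo_names = ["blob{}.csv".format(i) for i in range(600)]
    print([len(batch) for batch in chunked(demo_names)])
```

With the client from the question, the call would be delete_in_batches(client, file_list).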

Steven Jin

The error is due to a lack of permissions. Azure uses Shared Access Signature (SAS) tokens and roles to protect Azure Blob Storage objects such as containers and blobs. The code snippet above uses a credential that has read and list access to the blob container being used, but that identity does not have the correct role to delete blobs. Check the Azure documentation for the RBAC roles that allow blob deletion.

In order to delete a blob, the RBAC action that needs to be present for the role is Microsoft.Storage/storageAccounts/blobServices/containers/blobs/delete.

See the Azure documentation for the full list of RBAC actions.

Refer to this SO answer.

Anand Sowmithiran
  • Hi, I tried with updated roles. I am still unable to delete the blob files. I am using a managed identity in an Azure Function. Please see the latest edit. – Obiii May 18 '22 at 12:16
  • Others seem to have hit this error as well; the fix is to pass `api_version` as an additional parameter. See the GitHub discussion [here](https://github.com/Azure/azure-sdk-for-python/issues/16193). When creating the BlobServiceClient, pass the argument api_version="2019-12-12" and check whether that helps. – Anand Sowmithiran May 18 '22 at 13:45
  • Hi, Thanks. I have tried this but sadly it does not work. – Obiii May 19 '22 at 07:52
  • Have you updated to the latest version of the Azure Python SDK? – Anand Sowmithiran May 19 '22 at 10:44