
I have a situation where I want to get a list of all the folders from a registered datastore in Azure ML Studio. We are able to browse the folders in the Data section after selecting the particular datastore, but I didn't find any way to see the list programmatically in Python. Thanks in advance.

I want an iterable list containing the folder names.

3 Answers


I tried in my environment and got the below results:

I want to get a list of all the folders from the registered datastore in Azure ML studio.

Datastores are attached to workspaces and are used to store connection information to Azure storage services. In Azure Machine Learning, a blob container or file share backs the datastore. Initially, my datastore has two folders:


To list only the folders from blob storage (the datastore), you can use the azure-storage-blob package and the code below:

from azure.storage.blob import BlobServiceClient

connect_str = "<your connection string>"
container_name = "<your container name (datastore)>"
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
container_client = blob_service_client.get_container_client(container_name)
for blob in container_client.walk_blobs(delimiter="/"):
    print(blob.name)

Output:

The above code executes successfully and returns only the folder names.

folder1/
folder2/
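The prefixes yielded by walk_blobs keep the trailing delimiter. If you want bare folder names, a minimal follow-up sketch (pure string handling, no Azure calls; the sample list mirrors the output above):

```python
# Strip the trailing "/" that walk_blobs leaves on each folder prefix.
prefixes = ["folder1/", "folder2/"]
folder_names = [p.rstrip("/") for p in prefixes]
print(folder_names)  # ['folder1', 'folder2']
```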


If you need to list the folders together with their files, you can use the code below:

Code:

from azure.storage.blob import BlobServiceClient

connect_str = "<your connection string>"
container_name = "<container name (datastore)>"
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
container_client = blob_service_client.get_container_client(container_name)
for blob in container_client.list_blobs():
    print(blob.name)

Output:

The above code executes successfully and returns the folder paths with file names.

folder1/28-03-2023.html
folder1/subfolder1/20-03-2023.html
folder2/sas.txt
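If you already have the flat list_blobs output, the unique top-level folder names can be derived with plain string handling. A sketch using the sample paths above (no Azure calls):

```python
# Collect the distinct first path segment of every blob that sits in a folder.
blob_names = [
    "folder1/28-03-2023.html",
    "folder1/subfolder1/20-03-2023.html",
    "folder2/sas.txt",
]
folders = sorted({name.split("/")[0] for name in blob_names if "/" in name})
print(folders)  # ['folder1', 'folder2']
```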


Venkatesan

I was able to get those values using the code below.

from azureml.fsspec import AzureMachineLearningFileSystem

subscription_id = '84412ecc5c0d'
resource_group = 'nonprod-RG'
workspace_name = 'platform'
input_datastore_name = 'ids'
target_datastore_name = 'tds'
path_on_datastore = ''

# long-form datastore URI format:
uri = f'azureml://subscriptions/{subscription_id}/resourcegroups/{resource_group}/workspaces/{workspace_name}/datastores/{input_datastore_name}/paths/{path_on_datastore}'
# instantiate the file system using the datastore URI
fs = AzureMachineLearningFileSystem(uri)
# list entries under the path ('*' matches the top level of the datastore)
f_list = fs.glob('*')
region_list = []
for f in f_list:
    # paths come back as '<datastore>/<folder>/...', so index 1 is the folder
    region_list.append(f.split('/')[1])
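The loop above appends one entry per matched path, so folder names can repeat. A small sketch of de-duplicating the result (the sample paths are hypothetical stand-ins for the glob output):

```python
# Keep each folder name once; paths mimic '<datastore>/<folder>/<file>'.
paths = ["ids/folder1/a.csv", "ids/folder1/b.csv", "ids/folder2/c.csv"]
region_list = sorted({p.split("/")[1] for p in paths})
print(region_list)  # ['folder1', 'folder2']
```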



fs.ls() will return a list of all the contents in your blob container (the folder in your storage location). I also recommend reading the docs for any updates: LINK

from azureml.fsspec import AzureMachineLearningFileSystem

# define the URI - update <> placeholders
uri = 'azureml://subscriptions/<subscription_id>/resourcegroups/\
<rg_name>/workspaces/<ws_name>/datastores/workspaceblobstore/paths/<blob_name>/'


# create the filesystem
fs = AzureMachineLearningFileSystem(uri)
fs.ls() 
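fsspec file systems also support fs.ls(detail=True), which returns one dict per entry with a 'type' key, so files can be filtered out. A sketch over a sample listing (a stand-in for a real AzureMachineLearningFileSystem result, not live output):

```python
# Keep only the entries whose fsspec 'type' is 'directory'.
listing = [
    {"name": "workspaceblobstore/folder1", "type": "directory"},
    {"name": "workspaceblobstore/folder2", "type": "directory"},
    {"name": "workspaceblobstore/readme.txt", "type": "file"},
]
folders = [e["name"] for e in listing if e["type"] == "directory"]
print(folders)
```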
desertnaut
Natasha