I have a directory in an AzureML notebook that contains 300k files, and I need to list their names. The approach below works but takes 1.5 hours to execute:
from os import listdir
from os.path import isfile, join
mypath = "./temp/"
docsOnDisk = [f for f in listdir(mypath) if isfile(join(mypath, f))]
What is the Azure way to list those files quickly? (Both the notebook and this directory are in a FileShare.)
I am aware that the approach below will give some speedup, but it is still not the Azure way to do this.
from os import scandir
docsOnDisk = [f.name for f in scandir(mypath) if f.is_file()]  # should be 2-20x faster
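
For reference, here is a minimal, self-contained sketch of the scandir variant I have in mind. The function name `list_file_names` is just a placeholder I made up. My assumption is that the gain comes from `DirEntry.is_file()` usually reusing the file type returned by the directory listing instead of issuing a separate stat call per file, though on a network mount such as a FileShare that may not always hold.

```python
import os


def list_file_names(path):
    """Return the names of regular files directly inside `path` (non-recursive)."""
    # scandir yields DirEntry objects; is_file() can often answer from the
    # directory listing itself, avoiding one stat call per entry.
    with os.scandir(path) as entries:
        return [entry.name for entry in entries if entry.is_file()]
```

It would be called as `docsOnDisk = list_file_names(mypath)`.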