
I have an existing blob container with over 3 million blobs in it. I have written an Azure Function using BlobTrigger and a Blob output binding to copy each file, including its tags, to another container in another storage account.

The Azure docs seem to indicate that BlobTrigger on a standard Blob Storage container is not recommended, or perhaps not supported at all, for "high-scale" containers (containers with over 100,000 blobs in them).

My function does work against this container, but it takes about 9 minutes from startup, when the host lock lease is acquired, until the first files start processing.

The problem is, I need to process the existing files, and none of the other options in that Azure doc seem to support processing of existing blobs.

Do I proceed with the function I have, or should I avoid using it due to its long start time? Perhaps one of the event-based options is better, but then how do I "catch up" on the existing files first?

Dave Slinn

1 Answer


I agree with @Peter Bons: for the existing files you can use the azcopy copy command, which can also preserve tags. Following the Microsoft documentation, to copy a directory to a container in another storage account:

azcopy copy 'https://mysourceaccount.blob.core.windows.net/mycontainer/myBlobDirectory' 'https://mydestinationaccount.blob.core.windows.net/mycontainer' --recursive

If you want to copy containers, directories, and blobs:

azcopy copy 'https://mysourceaccount.blob.core.windows.net/' 'https://mydestinationaccount.blob.core.windows.net' --recursive
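
For the one-time catch-up with tags carried over, a minimal sketch might look like the following. It uses the placeholder account and container names from the commands above and the --s2s-preserve-blob-tags option mentioned in the comments below. The copy_with_tags helper and the AZCOPY variable are hypothetical names introduced only for this illustration, and credentials (SAS tokens or azcopy login) with permission to read and write tags are assumed to be set up already.

```shell
# Dry run by default: the command is printed, not executed.
# Set AZCOPY=azcopy in the environment to run the real transfer.
AZCOPY="${AZCOPY:-echo azcopy}"

# Hypothetical helper: copy everything under a source URL to a destination URL,
# asking AzCopy to preserve blob index tags during the service-to-service copy.
copy_with_tags() {
  # $1 = source URL, $2 = destination URL
  $AZCOPY copy "$1" "$2" --recursive --s2s-preserve-blob-tags
}

copy_with_tags \
  'https://mysourceaccount.blob.core.windows.net/mycontainer' \
  'https://mydestinationaccount.blob.core.windows.net/mycontainer'
```

Printing the command first makes it easy to check the URLs before starting a transfer of several million blobs.
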
RithwikBojja
  • Thanks, I'll look at doing a one-time AzCopy to catch up, and invest some time in setting up an Event Grid subscription for the ongoing replication. – Dave Slinn Jan 03 '23 at 19:12
  • This doesn't seem to carry the tags over from the source blobs to the copied ones. Is there a switch that does it? --blob-tags doesn't seem to be the one either; if I'm reading the docs right, that one is for setting tags on the copied blobs. EDIT: found the right command-line option. It's called --s2s-preserve-blob-tags – Dave Slinn Jan 03 '23 at 19:54