
I have a use case that often requires copying a blob (file) from one Azure region to another. The file size ranges from 25 to 45 GB. Needless to say, this is sometimes very slow, with inconsistent performance: a copy can take up to two hours, sometimes more. Distance plays a role, but the effect varies. Even within the same region, copying is slower than I would expect. I've tried:

  1. The Python SDK, using the copy blob method of the blob service.
  2. The REST API Copy Blob operation.
  3. AzCopy from the CLI.

I didn't really expect different results, though, since all of them use the same backend methods.

Is there any approach I am missing? Is there any way to speed up this process, or any kind of blob sharing integrated in Azure? VHD/disk sharing would also do.

Aleksandar Stojadinovic

2 Answers


You may want to try the /SyncCopy option in AzCopy:

Synchronously copy blobs from one storage account to another

AzCopy by default copies data between two storage endpoints asynchronously. Therefore, the copy operation runs in the background using spare bandwidth capacity that has no SLA in terms of how fast a blob is copied, and AzCopy periodically checks the copy status until the copying is completed or failed.
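The "start the copy, then poll its status" pattern described above can be sketched in Python as follows. This is only an illustration, not AzCopy's actual implementation: `wait_for_copy` and `get_status` are hypothetical names, and the status strings assume the Copy Blob status values `pending`, `success`, `aborted` and `failed`. With the current `azure-storage-blob` package, `get_status` could presumably read `get_blob_properties().copy.status` on the destination blob client, but check that against the SDK version you have installed.

```python
import time


def wait_for_copy(get_status, poll_seconds=5.0, timeout_seconds=3 * 60 * 60,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the copy is no longer 'pending'.

    get_status is a hypothetical callable returning the copy status string.
    Returns the final status, or raises TimeoutError if the copy is still
    pending once timeout_seconds have elapsed.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = get_status()
        if status != "pending":
            return status  # 'success', 'aborted' or 'failed'
        if clock() >= deadline:
            raise TimeoutError("copy still pending after timeout")
        sleep(poll_seconds)
```

Polling only reports progress; it does not make the background copy any faster, which is why the asynchronous path gives no speed guarantee.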

The /SyncCopy option ensures that the copy operation gets consistent speed. AzCopy performs the synchronous copy by downloading the blobs to copy from the specified source to local memory, and then uploading them to the Blob storage destination.

AzCopy /Source:https://myaccount1.blob.core.windows.net/myContainer/ /Dest:https://myaccount2.blob.core.windows.net/myContainer/ /SourceKey:key1 /DestKey:key2 /Pattern:ab /SyncCopy

/SyncCopy might generate additional egress cost compared to asynchronous copy. The recommended approach is to use this option in an Azure VM that is in the same region as your source storage account, which avoids the egress cost.
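If you want to do a similar synchronous, block-by-block copy from the Python SDK instead of AzCopy, the first step is to split the blob into byte ranges. The sketch below only does that planning; `plan_blocks` and the 100 MiB block size are assumptions made for illustration, not something AzCopy documents. Each range could then, for example, be passed to `BlobClient.stage_block_from_url` and the block IDs committed with `commit_block_list` in `azure-storage-blob`; verify those calls against your SDK version before relying on them.

```python
def plan_blocks(blob_size, block_size=100 * 1024 * 1024):
    """Return (block_id, offset, length) tuples covering blob_size bytes.

    Block IDs are fixed-width strings, because all block IDs of one blob
    must have the same length.
    """
    if blob_size < 0 or block_size <= 0:
        raise ValueError("blob_size must be >= 0 and block_size > 0")
    blocks = []
    offset = 0
    index = 0
    while offset < blob_size:
        # The last block may be shorter than block_size.
        length = min(block_size, blob_size - offset)
        blocks.append((f"{index:08d}", offset, length))
        offset += length
        index += 1
    return blocks


if __name__ == "__main__":
    # A 45 GiB blob in 100 MiB blocks.
    print(len(plan_blocks(45 * 1024**3)))
```

For a 45 GiB blob this yields 461 blocks, the last one 80 MiB long. The ranges are independent of each other, so they can be transferred concurrently.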

Zhaoxing Lu
  • OK, I'll try that out, although I had seen that option and didn't associate it with better performance. If I understand correctly, the blob will be downloaded to the invoking machine and then re-uploaded to the other location? Shouldn't that be slower? – Aleksandar Stojadinovic Nov 13 '18 at 08:56
  • The speed will be consistent, but it can be either slower or faster. The recommended approach is to use this option in an Azure VM that is in the same region as your source storage account. – Zhaoxing Lu Nov 13 '18 at 11:02

On Linux, you can try the --parallel-level option; run azcopy --help for details. Officially, the maximum number of concurrent operations is 512. Go bonkers!
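The same idea, a bounded number of concurrent operations, can be sketched in Python with a thread pool. This is an illustration of the concept rather than how AzCopy works internally: `run_parallel`, `worker` and the default level of 16 are hypothetical, and the 512 cap is simply the limit quoted above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL = 512  # upper limit quoted in this answer


def run_parallel(tasks, worker, parallel_level=16):
    """Run worker(task) for every task with at most parallel_level threads.

    Results come back in task order; the first exception raised by a worker
    propagates to the caller.
    """
    level = max(1, min(parallel_level, MAX_PARALLEL))
    with ThreadPoolExecutor(max_workers=level) as pool:
        return list(pool.map(worker, tasks))
```

Here `worker` would be whatever transfers one chunk or one blob. Raising the level only helps until bandwidth or storage account throttling becomes the bottleneck.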

Achyut Sarma