5

I would like to know if there is a way through which I can auto-sync an Azure blob to the local file system, so that whenever a blob gets added to a container an event is fired so it can be downloaded to a local folder.

So far I can sync from the local file system to Azure blobs, but not the other way.

If I use polling, how frequently should I poll and how much does it affect performance?

InteXX
Vivek Misra

3 Answers

5

This is something you'd need to manage yourself, as there's no way to watch a blob or container. That said, you can check a container's (or blob's) ETag to see if content has been updated.
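
A minimal sketch of such an ETag check, assuming the blob URL is readable over plain HTTPS (public container or a SAS URL) and that the service honors the standard `If-None-Match` header by answering 304 when nothing changed. The function names and the URL are hypothetical and only for illustration.

```python
import urllib.error
import urllib.request
from typing import Optional, Tuple


def build_conditional_request(blob_url: str,
                              last_etag: Optional[str]) -> urllib.request.Request:
    """Build a GET that asks the server to skip the body if the ETag is unchanged."""
    request = urllib.request.Request(blob_url, method="GET")
    if last_etag is not None:
        request.add_header("If-None-Match", last_etag)
    return request


def fetch_if_changed(blob_url: str,
                     last_etag: Optional[str]) -> Tuple[Optional[bytes], Optional[str]]:
    """Return (content, new_etag), or (None, last_etag) when the blob is unchanged."""
    request = build_conditional_request(blob_url, last_etag)
    try:
        with urllib.request.urlopen(request) as response:
            return response.read(), response.headers.get("ETag")
    except urllib.error.HTTPError as error:
        if error.code == 304:  # Not Modified: keep the local copy
            return None, last_etag
        raise
```

Each call is still one request against storage, so this only saves bandwidth, not transactions.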

As for polling: each time you do a GET, you'll incur a transaction hit. It's not very costly (a half-penny per 100K transactions): if you polled once every second, you'd spend maybe 15 cents monthly per role instance doing the polling. However, I think that's a bit aggressive, especially if you have other storage-related activities happening simultaneously (queue-polling, etc.).
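
The arithmetic behind that estimate, as a quick sketch. The price (half a penny per 100K transactions) is the figure quoted above and may not match current pricing; the 30-day month and the function name are assumptions.

```python
def monthly_polling_cost(interval_seconds: float,
                         price_per_100k: float = 0.005,
                         days: int = 30) -> float:
    """Estimated monthly cost in dollars of one poller issuing one GET per interval."""
    polls_per_month = days * 24 * 60 * 60 / interval_seconds
    return polls_per_month / 100_000 * price_per_100k


# One poll per second is 2,592,000 transactions in 30 days.
print(round(monthly_polling_cost(1), 2))   # 0.13
# One poll every five seconds cuts that to a fifth.
print(round(monthly_polling_cost(5), 3))   # 0.026
```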

This comes down to how often you think blob content will be updated, and how current your local cache needs to be.

You can also consider building an uploader service that takes an uploaded object, stores it in blob storage, and notifies all running instances that there's an update (including URL to the blob updated). Maybe use a service bus pub/sub for this? I don't know about your app's architecture and how content is uploaded; this may or may not work for you.
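
To make the shape of that idea concrete, here is an in-process sketch of the uploader-plus-notification pattern. Everything in it is hypothetical: `UploadNotifier`, the `store` callable standing in for the blob upload, and plain callbacks standing in for a real pub/sub topic such as a service bus.

```python
from typing import Callable, List


class UploadNotifier:
    """Stores an uploaded object, then tells every subscriber which URL changed."""

    def __init__(self, store: Callable[[str, bytes], str]) -> None:
        # `store` is assumed to write the blob and return its URL.
        self._store = store
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def upload(self, name: str, data: bytes) -> str:
        url = self._store(name, data)
        for notify in self._subscribers:
            notify(url)  # each running instance would download from this URL
        return url
```

With a real message bus, `subscribe` and the notification loop would be replaced by publishing to a topic that every instance listens on; the point is only that the uploader, not the storage account, is what raises the event.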

InteXX
David Makogon
  • Hi David, thanks for your response. I am trying to implement a feature that allows two-way sync between the local file system and blob storage (like the gadient tool). With a Windows service monitoring the local file system and updating the blob whenever a change occurs, I was able to implement that direction without much effort. Now I am stuck on syncing from Azure blobs to the local file system. I came across the Microsoft Sync Framework. Any input on that? I have no knowledge of Service Bus, so I don't know how I could utilize it. – Vivek Misra Sep 08 '12 at 16:36
  • Check out https://github.com/brentrossen/AzureBlobContainerSync, it can handle download synchronization for you. Or you can take a look into the code to see how it does etag checking. – Brent Aug 07 '14 at 19:57
  • @Brent - you're correct in that you can build a polling mechanism, such as the one you (?) built (note: if you post your own code, you should explicitly mention that). However, your solution is a polling solution (which may not work too well with millions of blobs and containers), and as I originally stated, there's no built-in watcher mechanism. That is, you cannot subscribe to an event that informs you of a blob (or rather its ETag) changing. – David Makogon Aug 08 '14 at 02:52
  • Ah, yes, I should have mentioned that this is my code. I just wanted to provide a sample of the polling mechanism you mentioned. It wouldn't be effective for millions of files. If the scale is millions of files, and the machine is within an Azure data center, it may be more effective to use the new File service and mount a file share using SMB: http://blogs.msdn.com/b/windowsazurestorage/archive/2014/05/12/introducing-microsoft-azure-file-service.aspx. But the effectiveness of this depends on a lot of factors that aren't mentioned in the original question. – Brent Aug 09 '14 at 10:29
3

I have code that does this. See https://github.com/smarx/noderole/blob/master/WebRole/Sync/OneWayBlobSync.cs.

user94559
  • Hi Smarx, I had a look yesterday, but I was concerned about how it would perform in production. Did you see any performance issues with this code, so that I can look into them? Thanks for your response. – Vivek Misra Sep 08 '12 at 18:28
  • I imagine if you had a large enough blob container, performance would start to suffer, since it has to fetch the list of all the blobs in the container every interval, and a list of local files (with associated ETags) is kept in memory. But I would guess you could have thousands of blobs and poll once every five seconds without breaking a sweat. Have you tried it? – user94559 Sep 08 '12 at 22:14
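
The approach described in that comment (list the container every interval, compare against an in-memory map of names to ETags) could look roughly like the sketch below. It is not the linked OneWayBlobSync.cs; `list_blobs`, `download`, and `delete_local` are hypothetical callables you would back with your storage client and file system code.

```python
import time
from typing import Callable, Dict, Optional


def poll_once(list_blobs: Callable[[], Dict[str, str]],
              download: Callable[[str], None],
              delete_local: Callable[[str], None],
              known: Dict[str, str]) -> None:
    """Run one sync pass; `known` maps blob name to the ETag last downloaded."""
    remote = list_blobs()  # full listing, so cost grows with container size
    for name, etag in remote.items():
        if known.get(name) != etag:  # new blob or changed content
            download(name)
            known[name] = etag
    for name in list(known):
        if name not in remote:  # blob was removed from the container
            delete_local(name)
            del known[name]


def run(list_blobs: Callable[[], Dict[str, str]],
        download: Callable[[str], None],
        delete_local: Callable[[str], None],
        interval_seconds: float = 5.0,
        max_polls: Optional[int] = None) -> None:
    """Poll forever (or `max_polls` times), sleeping between passes."""
    known: Dict[str, str] = {}
    polls = 0
    while max_polls is None or polls < max_polls:
        poll_once(list_blobs, download, delete_local, known)
        polls += 1
        time.sleep(interval_seconds)
```

Because `known` lives only in memory, a restart downloads everything again; persisting the map would avoid that at the cost of more bookkeeping.
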
3

Have you looked at this? Synchronizing Files to Windows Azure Storage

It's a bit outdated, but maybe a good starting point.

JuneT