
Azure - How do I increase performance on the same single blob download for 3,000-18,000 clients all downloading within a 5-minute window? (We can't use a CDN because we need the files to be private with SAS.)

Requirements:

  • We can't use a CDN because the file (blob) needs to be private. We generate SAS keys for each of the simultaneous download requests.
  • The blobs are the encrypted exams, uploaded 24-48 hours before an exam start time. Expect 3,000-18,000 downloads in a 5-10 minute window before the exam start time.
  • 172-1,000 blobs, ranging from 53 KB to 10 MB each.
  • We have a web service that verifies the student's info, PIN, and exam date/time. If everything checks out, it generates the blob URI and SAS (see the sketch after this list).
  • The Azure documentation says only 480 Mbit/s for a single blob.
  • But another part of the Azure documentation mentions as high as 20,000 transactions/sec at 20 Mbit/s.
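
For context, the SAS-issuing step in that web service looks roughly like the sketch below (classic Azure Storage client library; the connection string, container name, and time window are illustrative placeholders, not our production values):

    // Sketch only: issue a short-lived, read-only SAS URI after the student is verified.
    // Connection string, container name, and timings are placeholders.
    using System;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;

    public static class ExamSasIssuer
    {
        public static string GetExamDownloadUri(string connectionString, string examBlobName)
        {
            CloudStorageAccount account = CloudStorageAccount.Parse(connectionString);
            CloudBlobContainer container = account.CreateCloudBlobClient().GetContainerReference("exams");
            CloudBlockBlob blob = container.GetBlockBlobReference(examBlobName);

            // Read-only SAS that covers the 5-10 minute download window, with slack for clock skew.
            var policy = new SharedAccessBlobPolicy
            {
                Permissions = SharedAccessBlobPermissions.Read,
                SharedAccessStartTime = DateTimeOffset.UtcNow.AddMinutes(-5),
                SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddMinutes(15)
            };

            // GetSharedAccessSignature returns the query string ("?sv=..."), so append it to the blob URI.
            return blob.Uri + blob.GetSharedAccessSignature(policy);
        }
    }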

Ideas?

  • Would a snapshot of the blob help?
    • I thought a snapshot only helps when the source blob might be updated during a download?
  • Would Premium Storage help?
    • I read that Premium just means the blob is stored on SSD for more money, but what we need is more bandwidth for many clients hitting the same blob.
  • Would creating, say, 50 copies of the same exam help?
    • We would rotate each client browser through the copies of the file.

Also posted on the Azure forums: https://social.msdn.microsoft.com/Forums/azure/en-US/7e5e4739-b7e8-43a9-b6b7-daaea8a0ae40/how-do-i-increase-performance-on-the-same-single-blob-download-for-3000-18000-clients-all?forum=windowsazuredata

3 Answers


I would cache the blobs in memory using Azure Redis Cache instead of serving them from blob storage directly. In Azure you can provision a Redis cache sized for your volume, so you are no longer limited by the blob service.

When a file is first requested:

1. Check the Redis cache for the file.
   a. Found: serve the file from the cache.
   b. Not found: get the file from the blob and put it in the cache, then serve the file.

Subsequent requests will use the file from the cache, taking the load off Azure Blob storage.

This is better than duplicating the file on blob storage, since you can set an expiration time in the Redis cache and it will clean itself up.

https://azure.microsoft.com/en-us/documentation/articles/cache-configure/
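
A minimal sketch of that cache-aside flow, assuming StackExchange.Redis and the classic Azure Storage client (connection strings, the container name, and the one-hour expiry are placeholders):

    // Cache-aside sketch: check Redis first, fall back to blob storage once, then serve from Redis.
    using System;
    using System.IO;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;
    using StackExchange.Redis;

    public class ExamBlobCache
    {
        private readonly IDatabase _cache;
        private readonly CloudBlobContainer _container;

        public ExamBlobCache(string redisConnection, string storageConnection)
        {
            _cache = ConnectionMultiplexer.Connect(redisConnection).GetDatabase();
            _container = CloudStorageAccount.Parse(storageConnection)
                .CreateCloudBlobClient()
                .GetContainerReference("exams");    // container name is a placeholder
        }

        public byte[] GetExam(string blobName)
        {
            // 1. Check the Redis cache for the file.
            RedisValue cached = _cache.StringGet(blobName);
            if (cached.HasValue)
                return cached;                      // a. Found: serve from the cache.

            // b. Not found: fetch it once from blob storage...
            CloudBlockBlob blob = _container.GetBlockBlobReference(blobName);
            byte[] bytes;
            using (var ms = new MemoryStream())
            {
                blob.DownloadToStream(ms);
                bytes = ms.ToArray();
            }

            // ...cache it with an expiry so it cleans itself up after the exam window...
            _cache.StringSet(blobName, bytes, TimeSpan.FromHours(1));

            // ...and serve it.
            return bytes;
        }
    }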

greg_diesel
  • How would you serve the file from the cache? Via a client request to a web role? If yes, then why put the file in the cache to begin with? Why not just put the file in local storage on the web role? The cache is not really intended for 10 MB objects; they will end up getting serialized and performance will suffer. – kwill Jun 30 '15 at 20:40
  • @kwill My suggestion is to keep the files in memory and not use the disk on either azure blob storage or a worker role. I imagine the worker roles would have disk limitations as well that the questioner would encounter. – greg_diesel Jun 30 '15 at 20:47
  • OK, then you may want to revise your answer to say "in memory" rather than "Redis cache", because the Azure Redis cache is not in the role's memory; it is a remote Redis cache server. – kwill Jun 30 '15 at 20:52
  • Hmm, when I get time at work I'll run a test with your Redis cache link and see if it works. – JustAGuy Employee Jul 06 '15 at 14:15

Duplication. Rather than rotating, though, give the client a list of the copies and have them pick randomly. That also lets them fall back to another copy if the first request fails.
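
As a rough illustration of the pick-randomly-with-fallback idea (shown in C# for consistency with the other sketches; a browser client would do the same thing in JavaScript, and the list of copy URIs is assumed to come from your verification service):

    // Shuffle the list of duplicated-blob SAS URIs, then try them in order until one succeeds.
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Net.Http;
    using System.Threading.Tasks;

    public static class ExamDownloader
    {
        private static readonly HttpClient Http = new HttpClient();

        public static async Task<byte[]> DownloadAsync(IList<string> copySasUris)
        {
            // Random order spreads the load roughly evenly across the duplicate blobs.
            List<string> shuffled = copySasUris.OrderBy(_ => Guid.NewGuid()).ToList();

            foreach (string uri in shuffled)
            {
                try
                {
                    return await Http.GetByteArrayAsync(uri);   // first copy that responds wins
                }
                catch (HttpRequestException)
                {
                    // This copy failed or was throttled; fall back to the next one.
                }
            }

            throw new InvalidOperationException("All exam copies failed to download.");
        }
    }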

Jason Coyne

You can use SAS keys with the CDN, assuming you will be using the same SAS key for all users rather than generating a unique SAS for each user. If you expect the users to arrive within a 5-10 minute window, you could generate a single 15-minute SAS and use that with the CDN. Just make sure you also set the cache TTL on the blob to the same duration the SAS specifies, because the CDN won't actually validate the SAS permissions (blob storage will validate it any time the CDN has to fetch the object from origin). See Using Azure CDN with Shared Access Signatures for more information.
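
For example, something along these lines would pair a shared 15-minute SAS with a matching Cache-Control TTL on the blob (classic storage library; the container name and TTL are placeholders):

    // Set a cache TTL equal to the SAS lifetime, then issue one SAS shared by every student.
    using System;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Blob;

    public static class CdnSasSetup
    {
        public static string PrepareExamForCdn(string storageConnection, string blobName)
        {
            CloudBlockBlob blob = CloudStorageAccount.Parse(storageConnection)
                .CreateCloudBlobClient()
                .GetContainerReference("exams")          // placeholder container name
                .GetBlockBlobReference(blobName);

            // Cache TTL matches the SAS lifetime (15 minutes = 900 seconds).
            blob.FetchAttributes();                      // avoid clobbering other blob properties
            blob.Properties.CacheControl = "public, max-age=900";
            blob.SetProperties();

            // A single SAS shared by every student in the download window.
            var policy = new SharedAccessBlobPolicy
            {
                Permissions = SharedAccessBlobPermissions.Read,
                SharedAccessExpiryTime = DateTimeOffset.UtcNow.AddMinutes(15)
            };

            // For CDN delivery, the blob host in this URI would be swapped for your CDN endpoint host.
            return blob.Uri + blob.GetSharedAccessSignature(policy);
        }
    }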

Jason's suggestion of using multiple blobs is also a good solution.

You could also spin up several web role instances and host the file locally on those instances; then, instead of sending users a SAS URL (which could be used by non-authorized users), you could actually authenticate the user and serve the file directly from the web role. If all of your traffic will be within a ~10 minute window, you could spin up hundreds of instances and still keep the cost very low.
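
Very roughly, each instance could do something like the sketch below (Web API on a web role; ExamRegistry.IsAuthorized and the local file path are hypothetical stand-ins for your existing verification service and the locally staged exam files):

    // Authenticate the student in the role itself, then stream the locally held exam bytes,
    // so no SAS URL is ever handed out.
    using System.IO;
    using System.Net;
    using System.Net.Http;
    using System.Net.Http.Headers;
    using System.Web.Http;

    // Hypothetical stand-in for the verification service described in the question.
    public static class ExamRegistry
    {
        public static bool IsAuthorized(string studentId, string pin, string examId)
        {
            return true; // real check: student info, PIN, exam date/time
        }
    }

    public class ExamController : ApiController
    {
        [HttpGet]
        public HttpResponseMessage Download(string studentId, string pin, string examId)
        {
            if (!ExamRegistry.IsAuthorized(studentId, pin, examId))
                return Request.CreateResponse(HttpStatusCode.Forbidden);

            // Each instance keeps the encrypted exam on local disk (copied from blob storage at
            // startup); the path here is just an assumption.
            byte[] bytes = File.ReadAllBytes(Path.Combine(@"C:\exams", examId + ".bin"));

            var response = Request.CreateResponse(HttpStatusCode.OK);
            response.Content = new ByteArrayContent(bytes);
            response.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
            return response;
        }
    }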

kwill
  • Thanks, lots of ideas to try. We need the CDN file to expire in 10 hours; I saw they expire in 7 days by default. Also, some MSDN page said the CDN file takes a long time to delete. I tried CDN with SAS, but the security failed because I was able to download the file without the SAS key. Or was my CDN security setup wrong? We need to lock the file in the CDN so that only someone with a valid SAS can download it. – JustAGuy Employee Jul 06 '15 at 14:16
  • You can set the CDN expiration to however long you want by using the cache-control headers, and once it expires the removal is immediate (not sure what you mean by it taking a long time to delete). Make sure you enable Query Strings in the CDN options; otherwise you will get the behavior you saw, where you could access the file without using the SAS. – kwill Jul 06 '15 at 20:43