1

I am trying to set up a global counter in Windows Azure that keeps track of the number of games started within a day. Each time a player starts a game, a Web Service call is made from the client to the server, and a global counter is incremented by one. This should be fairly simple to do with a database... but I wonder how I could do it efficiently. The database approach is fine for a few hundred simultaneous clients, but what happens if I have 100,000 clients?

Thanks for your help/ideas!

Martin
  • Is the counter result required to be presented live? If not, you can log it to a file, then parse it with a scheduled job. – Raptor Nov 27 '12 at 06:04
  • Ideally I would like a live counter. – Martin Nov 27 '12 at 06:05
  • I'm still thinking of a file-based approach. How about appending 1 byte to a file for each new game? The counter value is the file size. The log resets every day. File I/O is faster than DB I/O, obviously. – Raptor Nov 27 '12 at 06:17
  • 1
    @ShivanRaptor too clever by half. If the entire population of China played the game then you'd be storing a 1 billion byte file, instead of a simple 32 bit / 4 byte integer. Even if you meant *bit* rather than *byte*, that's still over 100MB. – Kirk Broadhurst Nov 27 '12 at 07:05
  • The OP mentions there are only 100,000 clients. A simple way to deal with the over-size problem is to create a new file when the old one almost reaches the limit. You can make use of log4net or a similar logging library, or roll your own. – Raptor Nov 27 '12 at 07:24

3 Answers

6

A little over a year ago, this was a topic in a Cloud Cover episode: Cloud Cover Episode 43 - Scalable Counters with Windows Azure. They discussed how to create an Apathy Button (similar to the Like Button on Facebook).

Steve Marx also discusses this in detail in a blog post with source code: Architecting Scalable Counters with Windows Azure. In this solution they're doing the following:

  • On each instance, keep track of a local counter
  • Use Interlocked.Increment to modify the local counter
  • If the counter changed, save the new value in table storage (have a timer do this every few seconds). For each deployment/instance, you'll have 1 record in the counters table.
  • To display the total count, take the sum of all records in the counters table.
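A rough C# sketch of that pattern (the class and member names are mine, and an in-memory dictionary stands in for the Azure counters table so the example is self-contained; Steve Marx's actual code writes to table storage):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

class ScalableCounter
{
    // Stand-in for the counters table: one row (entry) per instance.
    static readonly ConcurrentDictionary<string, long> CountersTable =
        new ConcurrentDictionary<string, long>();

    long localCount;            // this instance's local counter
    readonly string instanceId;
    readonly Timer flushTimer;  // kept as a field so it isn't collected

    public ScalableCounter(string instanceId)
    {
        this.instanceId = instanceId;
        // Persist the local value every few seconds.
        flushTimer = new Timer(_ => Flush(), null,
            TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(5));
    }

    // Thread-safe increment of the local counter.
    public void Increment() => Interlocked.Increment(ref localCount);

    // Write this instance's current value to its row.
    public void Flush() =>
        CountersTable[instanceId] = Interlocked.Read(ref localCount);

    // The displayed total is the sum over all instances' rows.
    public static long Total()
    {
        long sum = 0;
        foreach (var value in CountersTable.Values) sum += value;
        return sum;
    }
}
```

The key property is that each instance only ever writes its own row, so there is no write contention on shared state; the only cross-instance operation is the cheap read-and-sum.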
Sandrino Di Mattia
0

Well, there are a bunch of choices. And I don't know which is best for you. But I'll present them here with some pros and cons and you can come to your own conclusions given your requirements.

The simplest answer is "put it in storage." Both SQL Azure and the core Azure table or blob storage options are available to you. One issue to contend with is performance in the face of large-scale concurrency, but I'd also encourage you to think about correctness. You really want something that supports an atomic increment if you're going to outsource this problem, IMO.
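For illustration, an atomic increment against SQL Azure could look like this (the table and column names are hypothetical; the point is that the increment happens server-side in a single UPDATE statement, so two concurrent callers can never both read-then-write the same value):

```csharp
using System.Data.SqlClient;

static class CounterStore
{
    // Assumes a hypothetical DailyCounters table keyed by date.
    public static void IncrementDailyCounter(string connectionString)
    {
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(
            "UPDATE DailyCounters SET GamesStarted = GamesStarted + 1 " +
            "WHERE CounterDate = CAST(GETUTCDATE() AS date)", conn))
        {
            conn.Open();
            cmd.ExecuteNonQuery();  // the read-modify-write is atomic in the DB
        }
    }
}
```

This is correct under concurrency, but every game start becomes a round trip to one hot row, which is exactly the scaling pressure the question worries about.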

Another variation of a storage-oriented option would be a highly available VM. You could spin up your own VM on Azure, back a data drive with Azure Drives, and then use something on top of the OS to do this (a database server, an app that uses the file system directly, whatever). This would be more similar to what you'd do at home, but it has fairly unfortunate trade-offs: your entire solution is now reliant on the availability of this one VM, and cost, scalability of the solution, and so on all need thinking about. Splunk is also an option to consider, if you look at VMs.

As an earlier commenter mentioned, you could compute off of log data. But this would likely not be super real time.

Service Bus is another option to consider. You could pump messages over SB for these events and have a consumer that reads them and emits a "summary." There are a bunch of design patterns to consider if you look at this. The SB stack is pretty well documented. Another interesting element of SB is that you might be able to trade off 100% correctness for perf/scale/cost. This might be a worthy trade-off for you depending upon your goals.
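A hedged sketch of that producer/consumer split with the Service Bus SDK of that era (queue creation, connection strings, and error handling omitted; names are illustrative):

```csharp
using System;
using Microsoft.ServiceBus.Messaging;  // NuGet: WindowsAzure.ServiceBus

static class GameCounterBus
{
    // Producer side: the web service fires one message per game start.
    public static void ReportGameStarted(QueueClient client)
    {
        client.Send(new BrokeredMessage("game-started"));
    }

    // Consumer side: a worker drains the queue and emits a summary.
    public static void RunSummarizer(QueueClient client)
    {
        long total = 0;
        while (true)
        {
            BrokeredMessage message = client.Receive(TimeSpan.FromSeconds(30));
            if (message == null) continue;  // nothing waiting right now
            total++;                        // one message == one game started
            message.Complete();             // remove it from the queue
            // Periodically persist `total` somewhere cheap to read
            // (table storage, cache) for the live dashboard.
        }
    }
}
```

Note the counter itself sees no contention: all the concurrency is absorbed by the queue, and a single consumer does the adding. The trade-off is latency (the "live" count lags by however far the consumer is behind).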

Azure also exposes queues which might be a fit. I'll admit I think SB is probably a better fit but it is worth looking at both if you are going down this path.

Sorry I don't have a silver bullet but I hope this helps.

Eric Fleischman
0

I would suggest you follow the pattern described in .NET Multi-Tier Application. It helps you decouple the web role that faces your clients from the worker role that stores the data to a persistence medium (SQL Server or Azure Storage) by using the Service Bus.

Also, this is an efficient model to scale, as you can spin up new instances of the web role, the worker role, or both. For the dashboard, depending on the load, you can cache your data periodically and serve it from the cache. This compromises the accuracy of the data, but still provides an option for easy scaling. You can even invalidate the cache every minute and reload it from the persistence medium to get the latest value.
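A minimal sketch of that cache-with-expiry idea, assuming a hypothetical delegate that reads the real count from the persistence medium:

```csharp
using System;

class CachedCounter
{
    readonly Func<long> readCountFromStore;  // hypothetical store lookup
    readonly TimeSpan ttl = TimeSpan.FromMinutes(1);
    long cachedValue;
    DateTime fetchedAt = DateTime.MinValue;  // forces a load on first read

    public CachedCounter(Func<long> readCountFromStore)
    {
        this.readCountFromStore = readCountFromStore;
    }

    public long Value
    {
        get
        {
            if (DateTime.UtcNow - fetchedAt > ttl)   // cache expired?
            {
                cachedValue = readCountFromStore();  // reload from the store
                fetchedAt = DateTime.UtcNow;
            }
            return cachedValue;  // may be up to one minute stale
        }
    }
}
```

The dashboard reads `Value`; only one store query per minute reaches the persistence medium no matter how many viewers there are.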

Regarding whether to use SQL Server or Azure Storage: if there is no need for relational capabilities like JOINs, you can very well go with Azure Storage.

Ramesh
  • Unfortunately this answer misses the core issues. The real problem is concurrency/correctness and performance of the core counter itself. Yes you can put an autoinc in a row somewhere in a sql database in azure. But how does that scale? How many ticks/sec can it support? What is the cost of supporting this? What does the availability look like? These are the real trade-offs you need to deal with. And cache doesn't help you. Cache helps for read, but the counter will see many orders of magnitude more writes than reads. And cache reads will be dirty. How does this help? It is a liability. – Eric Fleischman Nov 27 '12 at 08:23
  • Hi Eric, I think this addresses the concurrency part (using SB) and the scalability part, but it will not work if correctness is needed. Regarding correctness, I think this could be the reason why YouTube/FB round view counts / like counts to the closest whole numbers like 100s / 1K etc. Also, as I mentioned in my answer, reads from the cache won't be fully accurate, as that is a trade-off, but the system correctly keeps track of the count without losing any of it. – Ramesh Nov 27 '12 at 09:02