
Let's say I have a counter cache that needs to be incremented on every page load, and I have 10 web instances. How do I consistently increment a counter cache column?

Consistency is easy with one web instance, but with several instances running, a race condition may occur.

Here is a quick explanation. Say my counter cache column is called foo_counts and its starting value is 0. If two web instances handle a page load at the same time, both read the count as 0. When it comes time to increment, they both update the count from 0 to 1, so one increment is lost and the column ends up at 1 instead of 2.
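The lost update above can be modeled in a few lines of plain Ruby. This is just a simulation of the interleaving between two web instances, not actual ActiveRecord code:

```ruby
# Simulated read-modify-write race on the foo_counts column.
# Both "instances" read the value before either one writes.
foo_counts = 0

read_a = foo_counts  # instance A reads 0
read_b = foo_counts  # instance B reads 0, before A has written

foo_counts = read_a + 1  # instance A writes 1
foo_counts = read_b + 1  # instance B also writes 1; A's increment is lost

puts foo_counts  # 1, not the expected 2
```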

I looked at http://guides.rubyonrails.org/active_record_querying.html#locking-records-for-update

Any ideas would be greatly appreciated.

Christian Fazzini

2 Answers


You could use increment_counter:

increment_counter(counter_name, id)

Increment a number field by one, usually representing a count.

This does a direct UPDATE in SQL so Model.increment_counter(:c, 11) sends this SQL to the database:

UPDATE models SET c = COALESCE(c, 0) + 1 WHERE id = 11

so you don't have to worry about race conditions. Let the database do its job.
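To see why the single atomic UPDATE is safe, here is a plain-Ruby analog: the row lock the database takes during the UPDATE serializes concurrent writers, much like a Mutex serializes threads. This is an illustration of the principle only, not ActiveRecord code:

```ruby
# Analog of the database's atomic UPDATE: the read-modify-write
# happens as one indivisible unit under a lock, so no increment
# is ever lost, no matter how many writers run concurrently.
counter = 0
lock = Mutex.new

threads = 10.times.map do
  Thread.new do
    100.times do
      lock.synchronize { counter += 1 }  # read + write as one unit
    end
  end
end
threads.each(&:join)

puts counter  # 1000 -- every increment applied, none lost
```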

mu is too short
  • Hmmm, interesting. So the database does its own kind of locking/queuing? I'm just thinking out loud. What if several updates are thrown at the same time using this method? Does the count update consistently? – Christian Fazzini Aug 11 '12 at 02:09
  • @ChristianFazzini: Yes, RDBMSs solved this problem a couple of decades ago. – mu is too short Aug 11 '12 at 02:39

Consider queueing the increments and having workers do the actual incrementing in the background. You won't have completely up-to-the-millisecond data, but at least it will be accurate.
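A minimal sketch of that idea in plain Ruby, with a thread standing in for the background worker and a Queue standing in for the job queue (the names here are hypothetical, not a real job framework):

```ruby
# Page loads enqueue a token; a single worker applies the increments
# serially, so the counter update itself never races.
queue = Queue.new
counter = 0

worker = Thread.new do
  loop do
    job = queue.pop          # blocks until a job arrives
    break if job == :stop
    counter += 1             # only the worker touches the counter
  end
end

# Simulate 50 page loads coming from many web instances:
50.times { queue << :increment }
queue << :stop
worker.join

puts counter  # 50
```

With a real job queue (Resque, Sidekiq, etc.) the same shape applies: web processes enqueue, workers serialize the writes.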

x1a4
  • I thought about this. I can have `rescue ActiveRecord::StaleObjectError`, in the worker, and queue the job again. But this would mean several workers running in the background. And considering doing this for every page load, means a TON of workers – Christian Fazzini Aug 11 '12 at 01:13
  • another potential is to not track it in sql, but to use something like redis with its atomic [incr](http://redis.io/commands/incr) command, which should easily handle almost any write load that you need. – x1a4 Aug 11 '12 at 01:36
  • What tool(s) would you suggest for tracking long-term metrics on a web app? – Christian Fazzini Aug 11 '12 at 09:12
  • Long term, the three things you'll want are high write performance, easy shardability, and the ability to run queries over your dataset to actually get your analytics. SQL works fine in the small, and has great queryability. It isn't so hot in the other two categories. Typically you store immediate data in something with high write performance, like redis or cassandra, then move that data to something that allows map-reduce querying, like hadoop. That kind of setup is almost certainly overkill at the point you're at. Use SQL for now, but have a plan ready to execute when you grow beyond it. – x1a4 Aug 12 '12 at 21:18