11

I have heard the 'shard' technique mentioned several times with regard to solving scaling problems for large websites. What is this 'shard' technique and why is it so good?

surfmuggle
Phil Wright

3 Answers

9

Karl Seguin has a good blog post about sharding.

From the post:

Sharding is the separation of your data across multiple servers. How you separate your data is up to you, but generally it’s done on some fundamental identifier.
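To make "separating on some fundamental identifier" concrete, here is a minimal sketch that is not taken from the post. It assumes the identifier is a numeric user id and that there are three servers; the server names and the `shard_for` helper are hypothetical.

```python
# Hypothetical shard names; a real setup would hold connection details here.
SHARDS = ["db-server-0", "db-server-1", "db-server-2"]


def shard_for(user_id: int) -> str:
    """Map a user id to one shard using the id modulo the shard count."""
    return SHARDS[user_id % len(SHARDS)]


# Every read and write for a given user goes to the same server.
print(shard_for(5))   # db-server-2
print(shard_for(6))   # db-server-0
```

Modulo is only one possible rule. It spreads ids evenly, but changing the number of servers changes where most ids map, so existing data would have to be moved.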

Oded
2

In brief, imagine separating your users_tbl across several servers: users 1-5000 are on Server 1, users 5001-10000 are on Server 2, and so on. If your data model is sufficiently abstracted in code, it's often not a huge code change.

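Of course, this approach becomes difficult if most of your queries look like "SELECT COUNT(*) FROM users_tbl GROUP BY userType", because an aggregate like that has to read from every server. When the WHERE clause is "WHERE userid = 5", the query can be sent to a single server, and sharding makes much more sense.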
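A rough sketch of that difference, assuming non-overlapping ranges (ids 1-5000 on the first server, 5001-10000 on the second) and using in-memory dicts as stand-ins for the servers. The sample rows and all function names are made up for illustration.

```python
from bisect import bisect_left
from collections import Counter

# Inclusive upper bound of the user id range on each shard (assumed layout).
UPPER_BOUNDS = [5000, 10000]

# In-memory stand-ins for the two servers: user id -> user type (made-up rows).
SHARDS = [
    {5: "admin", 42: "member"},
    {5001: "member", 7500: "admin"},
]


def shard_index(user_id):
    """Return the index of the shard whose range contains user_id."""
    if not 1 <= user_id <= UPPER_BOUNDS[-1]:
        raise ValueError(f"user id {user_id} is outside every shard range")
    return bisect_left(UPPER_BOUNDS, user_id)


def get_user_type(user_id):
    """Like WHERE userid = ?: touches exactly one shard."""
    return SHARDS[shard_index(user_id)].get(user_id)


def count_by_user_type():
    """Like the GROUP BY query: must visit every shard and merge the counts."""
    totals = Counter()
    for shard in SHARDS:
        totals.update(shard.values())
    return dict(totals)


print(get_user_type(5))       # admin
print(count_by_user_type())   # {'admin': 2, 'member': 2}
```

The lookup by id does the same amount of work no matter how many servers there are, while the aggregate does more work with every server added.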

Tom Ritter
2

As 'sharding' is part of the architecture principles for large websites, you may be interested in listening to 'eBay's Architecture Principles with Randy Shoup' here.

Ryan Spears