13

So what's the idea behind a cluster?

  1. You have multiple machines, each with the same copy of the DB, and you spread the reads and writes across them. Is this correct?

  2. How does this idea work? When I make a select query, does the cluster analyze which server has fewer reads/writes and point my query to that server?

  3. When should you start using a cluster? I know this is a tricky question, but maybe someone can give me an example, like 1 million visits and a 100-million-row DB.

Abimaran Kugathasan
Uffo

2 Answers

11

1) Correct, with one caveat: not every data node holds a full copy of the cluster data, but every single bit of data is stored on at least two nodes.

2) Essentially correct. MySQL Cluster supports distributed transactions.

3) When vertical scaling is not possible anymore, and replication becomes impractical :)
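
To illustrate point 1, here is a minimal sketch of how data nodes can be arranged so that no node holds everything while every partition still lives on two nodes. It assumes the node-group layout that MySQL Cluster uses with `NoOfReplicas=2`; the function names and the modulo mapping of partitions to groups are simplifications made up for this example, not the cluster's actual internals.

```python
def build_node_groups(data_nodes, replicas=2):
    """Split data nodes into groups of `replicas` nodes each.

    Every node in a group stores the same partitions, so each
    partition exists `replicas` times in the cluster.
    """
    if replicas < 1 or len(data_nodes) % replicas != 0:
        raise ValueError("node count must be a multiple of the replica count")
    return [
        data_nodes[i:i + replicas]
        for i in range(0, len(data_nodes), replicas)
    ]


def nodes_holding(partition_id, node_groups):
    """Return the nodes that store a partition (simplified: modulo mapping)."""
    return node_groups[partition_id % len(node_groups)]


# Hypothetical four-node cluster with two copies of every partition.
groups = build_node_groups(["A", "B", "C", "D"], replicas=2)
for partition_id in range(4):
    print(partition_id, nodes_holding(partition_id, groups))
```

With four data nodes this gives two node groups: losing one node in a group leaves its partner with the data, but losing both nodes of the same group loses that part of the data.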


As promised, some recommended readings:

RandomSeed
  • Thank you. What exactly do you mean in point 3 by `and replication becomes impractical`? Can you give me an example? – Uffo Aug 07 '13 at 06:50
  • Are you asking about when replication becomes impractical, or about replication altogether? Horizontal scaling can be achieved through, for example, [circular replication](http://dev.mysql.com/doc/refman/5.6/en/mysql-cluster-replication-multi-master.html) behind a load balancer. But when you start implementing these things, you should ask yourself about implementing a cluster instead. – RandomSeed Aug 07 '13 at 11:59
  • [This answer](http://stackoverflow.com/a/5326403/1446005) may also be of interest to you. – RandomSeed Aug 07 '13 at 12:02
  • Great info, I definitely have to dig more into this and play around. Can you recommend some books or great sites? I will accept the post anyhow :) – Uffo Aug 08 '13 at 08:21
  • I just noticed I previously gave a wrong link about (non-clusterised) circular replication. Please [check this instead](http://www.cwik.ch/2011/03/setting-up-multi-master-circular-replication-with-mysql/). I will add some more links into my answer. – RandomSeed Aug 08 '13 at 08:45
  • I can't provide real-life examples of MySQL Cluster usage, as I have never had the chance to play with it in a production environment (all job offers will be carefully reviewed ;). – RandomSeed Aug 08 '13 at 09:21
  • Thank you, but you may want to leave the bounty pending for a few days, as it might attract better answers. – RandomSeed Aug 08 '13 at 11:12
2

1 -> Your first point is correct in a way. But I think that if multiple machines each held the same data, it would be replication rather than clustering. In clustering, the data is divided among the machines by horizontal partitioning: the split is based on rows, and an algorithm distributes the records among those machines.

The data is divided in such a way that each record gets a unique key, just as in a key-value pair, and each machine has a unique machine_id. Together these determine which key-value pair goes to which machine.
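
The key-to-machine mapping described above can be sketched roughly like this. It is only an illustration of hash-based horizontal partitioning: the function names, the `id` column and the machine ids are made up for the example, and the real cluster's hashing and fragment layout are more involved.

```python
import hashlib


def node_for_key(primary_key, node_ids):
    """Map a record's key to one machine id by hashing the key."""
    digest = hashlib.md5(str(primary_key).encode("utf-8")).hexdigest()
    return node_ids[int(digest, 16) % len(node_ids)]


def partition_rows(rows, node_ids, key="id"):
    """Group rows by the machine that would store them."""
    placement = {node_id: [] for node_id in node_ids}
    for row in rows:
        placement[node_for_key(row[key], node_ids)].append(row)
    return placement


# Hypothetical table rows and machine ids.
rows = [{"id": i, "name": f"user{i}"} for i in range(10)]
placement = partition_rows(rows, ["machine-1", "machine-2", "machine-3"])
for machine_id, stored in placement.items():
    print(machine_id, [row["id"] for row in stored])
```

The same key always hashes to the same machine, so a lookup by key only has to contact the machine that owns that row.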

Each machine is called a node, and each node consists of its own mysql-server, its own data and a cluster manager. Data is also shared between all the cluster nodes, so that all the data is available to every node at any time.

Data can be retrieved through memcached servers for fast lookups, and there is also a replication server for each cluster to safeguard the data.

2 -> Yes, that is possible, because all the data is shared among all the cluster nodes. You can also use a load balancer to spread the load; load balancers are quite common and are used in front of most servers. But if you are only trying this out to learn, there is no need for one: you will not see the kind of load that requires a load balancer, and the cluster manager can do the whole thing by itself.

3 -> RandomSeed is right. You feel the need for a cluster when replication becomes impractical. For example, if you use the master server for writes and the slaves for reads, at some point the traffic becomes so large that the servers can no longer work smoothly, and then you will feel the need for clustering, simply to speed up the whole process. This is just one scenario, not the only one.
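
For reference, the master-for-writes, slaves-for-reads setup mentioned in point 3 usually means something like the sketch below sits in the application or in a proxy. The class name and server names are hypothetical, and the check for `SELECT` is deliberately naive; it only shows where the routing decision is made.

```python
import itertools


class ReadWriteRouter:
    """Send SELECTs to the slaves in turn and everything else to the master."""

    def __init__(self, master, slaves):
        self.master = master
        # Round-robin over the slaves; fall back to the master if there are none.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def route(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb == "SELECT" and self._slaves is not None:
            return next(self._slaves)
        return self.master


# Hypothetical server names.
router = ReadWriteRouter("master", ["slave-1", "slave-2"])
print(router.route("SELECT * FROM users"))
print(router.route("INSERT INTO users (name) VALUES ('x')"))
```

Reads scale by adding slaves, but every write still goes through the single master, which is the point where this setup stops scaling.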

hope this is helpful for you!!