
Since the redis cluster is still a work in progress, I want to build a simplified one by myself at the current stage. The system should support data sharding, load balancing and master-slave backup. A preliminary plan is as follows:

  1. Master-slave: use multiple master-slave pairs in different locations to enhance data security. Masters are responsible for write operations, while both masters and slaves can serve reads. Data is sent to all the masters during one write operation. Use Keepalived between the master and the slave to detect failures and switch the master-slave roles automatically.

  2. Data sharding: implement consistent hashing on the client side to support data sharding during writes/reads, in case the memory of a single machine is not enough (a minimal client-side sketch follows this list).

  3. Load balance: use LVS to redirect read requests to the corresponding server for load balancing.
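
For item 2, here is a minimal sketch of client-side consistent hashing. It is not part of the original plan: the shard names, the virtual-node count and the md5-based hash are illustrative choices of mine; it only shows how a key is mapped to one shard.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Map keys to shards using a hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=100):
        # Each physical node is placed on the ring many times (virtual nodes)
        # so keys spread evenly and only ~1/N of them move when a node is
        # added or removed.
        self._ring = []
        for node in nodes:
            for i in range(vnodes):
                self._ring.append((self._hash("%s#%d" % (node, i)), node))
        self._ring.sort()

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

    def get_node(self, key):
        """Return the shard responsible for `key` (first ring point clockwise)."""
        idx = bisect.bisect(self._ring, (self._hash(key), ""))
        return self._ring[idx % len(self._ring)][1]

# Example: three shards, each backed by one master-slave pair
# (cf. servers A, B and C in the example below).
ring = ConsistentHashRing(["shardA", "shardB", "shardC"])
print(ring.get_node("user:1001"))  # always maps this key to the same shard
```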

My question is: how do I combine LVS with the data sharding?

For example, because of data sharding, all keys are split and stored on servers A, B and C without overlap. Considering the slave backup and the other master-slave pairs, the system will contain 1(A,B,C), 2(A,B,C), 3(A,B,C) and so on, where each group has three servers. How should LVS be configured to support the redirection in such a situation when a read request comes? Or is there another approach in redis to achieve the same goal?

Thanks:)

Shuai Tao

1 Answer


You can get really close to what you need by using:

  • twemproxy to shard data across multiple redis nodes (it also supports node ejection and connection pooling); see the config sketch below
  • redis master/slave replication
  • redis sentinel to handle master failover

Depending on your needs, you will probably also want a script that listens for failovers (see the sentinel docs) and cleans things up when a master goes down.
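
To make the twemproxy and sentinel items concrete, here are two minimal configuration sketches. They are not part of the original answer: the pool name, master name, IP addresses, ports and timeouts are placeholders of mine.

A twemproxy (nutcracker) pool that shards keys across three redis masters using consistent (ketama) hashing:

```yaml
# nutcracker.yml -- hypothetical pool "alpha" in front of masters A, B, C
alpha:
  listen: 127.0.0.1:22121        # clients connect here instead of redis directly
  hash: fnv1a_64                 # hash function applied to each key
  distribution: ketama           # consistent hashing across the servers below
  redis: true                    # speak the redis protocol
  auto_eject_hosts: true         # temporarily eject servers that keep failing
  server_retry_timeout: 30000    # ms before retrying an ejected server
  server_failure_limit: 3        # consecutive failures before ejection
  servers:
   - 192.168.0.11:6379:1 serverA
   - 192.168.0.12:6379:1 serverB
   - 192.168.0.13:6379:1 serverC
```

A sentinel configuration monitoring one of those masters (one such block per master; the quorum of 2 means two sentinels must agree the master is down before a failover starts):

```
# sentinel.conf -- hypothetical addresses
port 26379
sentinel monitor masterA 192.168.0.11 6379 2
sentinel down-after-milliseconds masterA 5000
sentinel failover-timeout masterA 60000
sentinel parallel-syncs masterA 1
```

Note that twemproxy is not sentinel-aware, which is why the last point above matters: when sentinel promotes a slave, a script has to rewrite the twemproxy server list (or repoint a virtual IP, e.g. via Keepalived) so that the proxy reaches the new master.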

Tommaso Barbugli
  • yeah, twemproxy seems to be the right one I am seeking. Redis sentinel is still in development and not mature. I want to deploy the cluster in a production environment, so I need another approach to handle master failover. Your answer helps a lot. Thanks. – Shuai Tao Oct 14 '13 at 02:59
  • @ShuaiTao redis-sentinel is stable and can be used in production, in fact it is part of redis since 2.6 – Tommaso Barbugli Oct 14 '13 at 10:43
  • There is a warning in the [doc](http://redis.io/topics/sentinel) which says redis-sentinel is still a work in progress, so I'm not sure whether I could use it in production. Is the doc an old version that has not been updated, while redis-sentinel has in fact been stable since 2.6? – Shuai Tao Oct 15 '13 at 08:51
  • @ShuaiTao afaik the version in 2.6 is ready for production (or it would not be distributed as part of it I guess) – Tommaso Barbugli Oct 15 '13 at 11:58
  • @ShuaiTao, what did you end up using? I am doing the exact same thing right now and want to evaluate the best approaches. I came across TwemProxy and AdvancedJedis. I would like to know what you went for and how it worked out for you. – snegi Dec 17 '13 at 20:28
  • @snegi, in the end I used twemproxy in production to support the data sharding and wrote some scripts to handle master failover and data recovery. But twemproxy seems to have a memory leak problem, especially when writing data fails. It's also not friendly to pipeline operations on the client side. Somebody recommended Tair (https://github.com/alibaba/tair) to me for deploying the redis cluster. I think I would prefer to try Tair if I get another chance. – Shuai Tao Dec 19 '13 at 01:40
  • @ShuaiTao AFAIK twemproxy does not have memory leaks, perhaps you have something misconfigured (have a look here https://github.com/twitter/twemproxy/blob/master/notes/recommendation.md#read-writev-and-mbuf). Regarding Tair, I don't see why one would want to use some unknown/obscure software that just a bunch of people use in production (if not only the authors) when you can pick something like Riak (and similar distributed KV software) that has way larger adoption. – Tommaso Barbugli Dec 19 '13 at 10:28
  • @TommasoBarbugli, twemproxy is stable in most cases and can work well in production when used in an appropriate way. But we did encounter some problems when using it. The memory allocation of twemproxy will sometimes increase abnormally (and not recover after the operation) when the client sends a large amount of commands and data at a time using pipelining. Besides, twemproxy does not support pipelines with transactions on the client side, so it is not convenient to roll back when a pipeline fails halfway. – Shuai Tao Dec 21 '13 at 06:55
  • @TommasoBarbugli, As for Tair, it's because one of my friends is familiar with it and has used it in production before. I agree that one should pick a more general and well-known solution:) – Shuai Tao Dec 21 '13 at 06:56