The setup

I have a Redis failover setup consisting of three sentinels and two Redis servers, all on separate boxes.

The setup looks like this:

-------------------
| Sentinel1 - AMS |\
------------------- \  ---------------------------
         |           -/| Redis Server1 (M) - FRA |
-------------------  / ---------------------------
| Sentinel2 - FRA |--
-------------------  \ ---------------------------
         |           -\| Redis Server2 (S) - AMS |
------------------- /  ---------------------------
| Sentinel3 - LON |/
-------------------

All sentinels and servers can see each other via VPN.

The configuration for the sentinels is:

# Ansible managed

daemonize yes
pidfile "/var/run/redis/redis-sentinel.pid"
logfile "/var/log/redis/redis-sentinel.log"

# Note the ip changes for each sentinel  - 12,13,14

bind 192.168.1.14
port 26379
dir "/var/lib/redis"

sentinel monitor q-redis-01 192.168.1.10 6379 2
sentinel down-after-milliseconds q-redis-01 10000
sentinel auth-pass q-redis-01 XXX

And an excerpt of the configuration for the Redis servers is:

# Ansible managed

daemonize yes
pidfile "/var/run/redis/redis-server.pid"
port 6379
tcp-backlog 511

# Note the ip changes for each server  - 10, 11
bind 192.168.1.10

timeout 0
tcp-keepalive 0
loglevel notice
logfile "/var/log/redis/redis-server.log"
databases 10

save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename "dump.rdb"
dir "/var/lib/redis"

masterauth "XXX"

slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no

# Note Server 1 has priority 10 and Server 2 has 20
slave-priority 10

requirepass "XXX"

...

In the configuration of Server 2 I also have this line:

slaveof 192.168.1.10 6379

The problem

The setup works: when Server 1 is unreachable, Server 2 is promoted to master.

What I want, though, is for Server 1 to become the master again automatically once it recovers.

I need this because the FRA datacenter is closer to the rest of the infrastructure, and the whole setup is used for failover, not for scalability.

The question

Is it possible to configure the Redis sentinels to automatically promote a recovered master node back to master of the group?

drinchev

2 Answers

I wondered about this too, but I don't think the sentinels will make it master again automatically.

But we can achieve that goal by forcing a failover:

  1. R1 (Redis Server1 in your diagram) dies and R2 (Redis Server2) is promoted to master.
  2. R1 comes back after recovery, and the sentinels reconfigure it as a slave of R2.
  3. Run the SENTINEL FAILOVER <master name> command against one of the sentinels to make R1 master again.
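
If you want to script step 3 instead of running it by hand, the decision part could look roughly like the sketch below. It is only a sketch under stated assumptions: the function names are made up for illustration, the replica dictionaries are assumed to carry the `ip`, `port`, `flags` and `master-link-status` fields that `SENTINEL replicas <master name>` reports, and it deliberately refuses to act unless the preferred node is the only healthy replica, because with several replicas Sentinel chooses which one to promote. Actually sending the command to a sentinel is left to whatever client you use.

```python
from typing import Dict, List, Optional, Tuple

MASTER_NAME = "q-redis-01"                 # master name from the sentinel config in the question
PREFERRED_MASTER = ("192.168.1.10", 6379)  # Server 1 (FRA) from the question


def is_healthy(replica: Dict[str, str]) -> bool:
    """Hypothetical helper: a replica counts as healthy if Sentinel does not flag it
    as down or disconnected and its replication link to the master is up."""
    flags = set(replica.get("flags", "").split(","))
    bad_flags = {"s_down", "o_down", "disconnected"}
    return (
        "slave" in flags
        and not (flags & bad_flags)
        and replica.get("master-link-status") == "ok"
    )


def failback_command(
    current_master: Tuple[str, int],
    replicas: List[Dict[str, str]],
    preferred: Tuple[str, int] = PREFERRED_MASTER,
    master_name: str = MASTER_NAME,
) -> Optional[List[str]]:
    """Return the command to send to a sentinel, or None if nothing should be done."""
    if current_master == preferred:
        return None  # the preferred node is already master

    healthy = [r for r in replicas if is_healthy(r)]
    # Assumption: only fail back when the preferred node is the sole healthy
    # replica, so the forced failover cannot promote some other node.
    if len(healthy) != 1:
        return None

    candidate = healthy[0]
    if (candidate["ip"], int(candidate["port"])) != preferred:
        return None

    return ["SENTINEL", "FAILOVER", master_name]
```

Run periodically (for example from cron), this returns `["SENTINEL", "FAILOVER", "q-redis-01"]` once R1 is back as a healthy slave of R2, and `None` in every other case. It does not check whether R1 has finished resyncing beyond the link status, so you may want a delay before acting on it.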
whatacold

I think you're going about this the wrong way.

First, I would strongly consider having three servers (1 master, 2 slaves) instead of your current configuration. Keep in mind that you can run Sentinel and the Redis cache on the same servers, so instead of needing 6 servers you would still only need 3. Take a look at the docs: Example 2: basic setup with three boxes.
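
To make the three-box reasoning concrete, here is a small sketch of the rule as I understand it from the Sentinel documentation: a failover needs at least `quorum` sentinels to agree that the master is down, and it also has to be authorized by a majority of all known sentinels. The function name and parameters are made up for illustration; `reachable_sentinels` stands for the sentinels that agree the master is down and can still talk to each other.

```python
def failover_possible(total_sentinels: int, reachable_sentinels: int, quorum: int) -> bool:
    """Sketch of the Sentinel rule: enough sentinels to reach the configured
    quorum AND enough to form a majority of all known sentinels."""
    majority = total_sentinels // 2 + 1
    return reachable_sentinels >= quorum and reachable_sentinels >= majority


# Three sentinels with quorum 2, as in the question's configuration:
print(failover_possible(3, 2, 2))  # True: one box can be lost
print(failover_possible(3, 1, 2))  # False: a lone sentinel cannot fail over
```

With three boxes each running Redis and Sentinel, losing any single box still leaves two sentinels, which satisfies both conditions.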

Second, I would replicate this 3-server configuration in each datacenter, with each datacenter managing its own replication. This is for a number of reasons: 1) Latency between updates. We know that Redis is considered eventually consistent, but you don't necessarily want this much latency. 2) You don't want this much outgoing bandwidth between datacenters. 3) Compliance requirements such as GDPR in the EU.

Instead, have your app residing in LON pull cache from the LON Redis instance. Similarly, have your FRA app instance pull from the Redis instance in FRA.

If you must have the setup you've designed, I would strongly recommend you stand up a full Redis Cluster rather than just using Sentinel. Or, better yet, just use Microsoft Azure's version of Redis, where it's fully managed for you (and very cheap).

Hope this helps.

a11smiles
    Thanks for the text, but this does not answer the question. Regardless of what his infrastructure is like, he needs to promote the original master when it is reachable again. – Sergi Jan 17 '20 at 12:28