
I have set up a Redis master-slave(s) cluster with Sentinel monitoring for HA on Debian Linux (using stretch backports: redis v4.0.2).

Sentinel is working well: when I shut down one of the three nodes, another node is elected as the new master.

Now I am trying to set up a reconfig script to notify clients of the new master.

I created a readable and executable (chmod a+rx) script at /var/redis/test.sh, then added the following line to /etc/redis/sentinel.conf on each of my 3 sentinel nodes:

sentinel client-reconfig-script mymaster /var/redis/test.sh

Checking the sentinel config with a "sentinel master mymaster" command, I can confirm that client-reconfig-script is set correctly:

10.2.0.6:26379> sentinel master mymaster
...
43) "client-reconfig-script"
44) "/var/redis/test.sh"

However, when a failover occurs, my reconfig script does not appear to be triggered, and I do not understand why. Here is the sentinel log:

29765:X 16 Oct 23:03:11.724 # Executing user requested FAILOVER of 'mymaster'
29765:X 16 Oct 23:03:11.724 # +new-epoch 480
29765:X 16 Oct 23:03:11.724 # +try-failover master mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:11.777 # +vote-for-leader 5a0661a5982701465a387b4872cfa4c576edbd38 480
29765:X 16 Oct 23:03:11.777 # +elected-leader master mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:11.777 # +failover-state-select-slave master mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:11.854 # +selected-slave slave 10.2.0.8:6379 10.2.0.8 6379 @ mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:11.854 * +failover-state-send-slaveof-noone slave 10.2.0.8:6379 10.2.0.8 6379 @ mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:11.910 * +failover-state-wait-promotion slave 10.2.0.8:6379 10.2.0.8 6379 @ mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:12.838 # +promoted-slave slave 10.2.0.8:6379 10.2.0.8 6379 @ mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:12.838 # +failover-state-reconf-slaves master mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:12.893 * +slave-reconf-sent slave 10.2.0.6:6379 10.2.0.6 6379 @ mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:13.865 * +slave-reconf-inprog slave 10.2.0.6:6379 10.2.0.6 6379 @ mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:13.865 * +slave-reconf-done slave 10.2.0.6:6379 10.2.0.6 6379 @ mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:13.937 # +failover-end master mymaster 10.2.0.7 6379
29765:X 16 Oct 23:03:13.937 # +switch-master mymaster 10.2.0.7 6379 10.2.0.8 6379
29765:X 16 Oct 23:03:13.937 * +slave slave 10.2.0.6:6379 10.2.0.6 6379 @ mymaster 10.2.0.8 6379
29765:X 16 Oct 23:03:13.937 * +slave slave 10.2.0.7:6379 10.2.0.7 6379 @ mymaster 10.2.0.8 6379

Am I missing a configuration option?

Additional information: I installed a similar architecture a few weeks ago (redis 4.0.1) and it worked (I mean it fired my reconfig script), but I did not keep the configuration, so I may have missed something. Or... could it be a bug introduced in v4.0.2?!

Nicolas Payart

3 Answers


In my case, the 'chroot-like environment' was the systemd setup that comes with the default apt install redis-sentinel.

Changing these options in /etc/systemd/system/sentinel.service:

PrivateTmp=no
ReadWriteDirectories=-/tmp

will make writing a test file to /tmp work as expected.

Sending emails from the command line in the script involves switching most of the other options off (or switching the service to run as root...)
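To see which of these restrictions apply on a given machine, it can help to list the sandboxing directives in the unit file before changing anything. The following is only a sketch: the function name sandbox_options is made up for illustration, the list of directives is not exhaustive, and the unit name in the usage comment (redis-sentinel) is an assumption that may differ on your system.

```shell
#!/bin/bash
# Print the sandboxing-related directives found in a systemd unit file.
# Reads the named files, or stdin when no file is given.
# NOTE: "sandbox_options" is a hypothetical helper name, and the directive
# list below is a selection, not a complete list of systemd options.
sandbox_options() {
    grep -E '^(User|PrivateTmp|ProtectSystem|ProtectHome|ReadOnlyDirectories|ReadWriteDirectories|ReadOnlyPaths|ReadWritePaths|InaccessibleDirectories)=' "$@"
}

# Example usage (the unit name is an assumption):
#   systemctl cat redis-sentinel | sandbox_options
```

With PrivateTmp enabled, the service gets its own private /tmp, so a file the script creates there does not show up in the /tmp you see from a normal shell.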

James West

I finally solved my problem.

The "reconfig.sh" script WAS fired by the failover, but I didn't realize it, because:

  1. sentinel logging (even in debug mode) is not very clear about the reconfig script execution
  2. the reconfig script seems to be run in a chroot-like environment, which made my tests meaningless!

Here is the sentinel log when a client-reconfig-script is triggered ("script-child" lines):

32711:X 18 Oct 16:06:42.615 # +failover-state-reconf-slaves master mymaster 10.2.0.6 6379
32711:X 18 Oct 16:06:42.671 * +slave-reconf-sent slave 10.2.0.8:6379 10.2.0.8 6379 @ mymaster 10.2.0.6 6379
32711:X 18 Oct 16:06:42.671 . +script-child 397
32711:X 18 Oct 16:06:42.813 . -script-child 397 0 0
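To spot these lines in a long log, a small filter can help. This is a sketch only: the function name script_results is hypothetical, and reading the three trailing numbers as pid, exit code and terminating signal is my understanding of the Sentinel source, not something the log itself states. The lines only appear at debug log level.

```shell
#!/bin/bash
# Print the trailing fields of every "-script-child" line in a Sentinel log.
# Assumed meaning of the three fields: pid, exit code, terminating signal.
# NOTE: "script_results" is a hypothetical helper name.
# Reads the named log files, or stdin when no file is given.
script_results() {
    awk 'NF >= 4 && $(NF-3) == "-script-child" { print $(NF-2), $(NF-1), $NF }' "$@"
}

# Example usage (the log path is an assumption):
#   script_results /var/log/redis/redis-sentinel.log
```

For the log above, the filter keeps only the "-script-child 397 0 0" entry, which under that reading means the script ran and exited with status 0.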

Then my reconfig.sh looked like this:

#!/bin/bash
touch /tmp/reconfig
exit 0

=> Don't expect to find a /tmp/reconfig file on the host when this script is called by Sentinel!

However, I still do not know exactly how it works internally...
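A more reliable test than touching a file in /tmp is to have the script append its arguments to a log file in a directory the Sentinel process can write to. Sentinel calls the script with seven arguments: master name, role, state, from-ip, from-port, to-ip, to-port. The sketch below makes some assumptions: the function name log_reconfig, the RECONFIG_LOG variable and the default path /var/log/redis/reconfig.log are all hypothetical, and that directory is only assumed to be writable by the service.

```shell
#!/bin/bash
# Sketch of a client-reconfig-script that records each invocation.
# Sentinel passes:
#   <master-name> <role> <state> <from-ip> <from-port> <to-ip> <to-port>
# NOTE: RECONFIG_LOG and its default path are assumptions; choose a
# location that the Sentinel service user is allowed to write to.
log_reconfig() {
    local log_file="${RECONFIG_LOG:-/var/log/redis/reconfig.log}"
    printf '%s master=%s role=%s state=%s from=%s:%s to=%s:%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$1" "$2" "$3" "$4" "$5" "$6" "$7" \
        >> "$log_file"
}

# Only log when called with the full argument list, as Sentinel does.
if [ "$#" -ge 7 ]; then
    log_reconfig "$@"
fi
```

Running it by hand as the service user, with the same seven arguments, is a quick way to check permissions before waiting for a real failover.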

Nicolas Payart
  • Hi Nicolas, did you manage to get this working in the end? I appear to be having a similar issue where I can run the script myself but sentinel cannot. What did you do to resolve the issue? – Ben Aug 23 '19 at 14:41

If you run redis as the user 'root', the client-reconfig-script will be triggered.

liuhao