
I have two ProxySQL (2.0.13) instances + keepalived in front of a 3-node Galera cluster, and 2 of the Galera nodes appear as "SHUNNED" in my hostgroup 10. I do not know why they appear as "SHUNNED", and they never recover.

So I was hoping that somebody here can help me bring the shunned servers back into the loop.

What I've tried:
- Set mysql-shun_on_failures = 5000 and mysql-shun_recovery_time_sec = 10
- Rebooted the ProxySQL nodes

I have also made connections to hostgroup 10 and run some queries to trigger a recovery, but the servers are still shunned.
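For reference, variable changes like the ones above only take effect after being loaded to runtime. A sketch of how I verified them on the Admin interface (variable values are the ones I set, not defaults):

```sql
-- Check the current shun-related variables:
SELECT variable_name, variable_value
FROM global_variables
WHERE variable_name IN ('mysql-shun_on_failures', 'mysql-shun_recovery_time_sec');

-- After an UPDATE, apply and persist the change:
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;
```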

Here's some info from ProxySQL:

Servers:
Admin> select * from runtime_mysql_servers;
+--------------+-----------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| hostgroup_id | hostname  | port | gtid_port | status  | weight | compression | max_connections | max_replication_lag | use_ssl | max_latency_ms | comment |
+--------------+-----------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+
| 10           | 10.4.4.21 | 3306 | 0         | SHUNNED | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 20           | 10.4.4.21 | 3306 | 0         | ONLINE  | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 20           | 10.4.4.22 | 3306 | 0         | ONLINE  | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 30           | 10.4.4.23 | 3306 | 0         | ONLINE  | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 30           | 10.4.4.22 | 3306 | 0         | ONLINE  | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 30           | 10.4.4.21 | 3306 | 0         | ONLINE  | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 10           | 10.4.4.23 | 3306 | 0         | ONLINE  | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
| 10           | 10.4.4.22 | 3306 | 0         | SHUNNED | 1      | 0           | 1000            | 0                   | 0       | 0              |         |
+--------------+-----------+------+-----------+---------+--------+-------------+-----------------+---------------------+---------+----------------+---------+


Hostgroups:
Admin> select * from runtime_mysql_galera_hostgroups;
+------------------+-------------------------+------------------+-------------------+--------+-------------+-----------------------+-------------------------+--------------------+
| writer_hostgroup | backup_writer_hostgroup | reader_hostgroup | offline_hostgroup | active | max_writers | writer_is_also_reader | max_transactions_behind | comment            |
+------------------+-------------------------+------------------+-------------------+--------+-------------+-----------------------+-------------------------+--------------------+
| 10               | 20                      | 30               | 9999              | 1      | 1           | 1                     | 30                      | galera-hostgroup-1 |
+------------------+-------------------------+------------------+-------------------+--------+-------------+-----------------------+-------------------------+--------------------+

Status:
Admin> select hostgroup,srv_host,status,ConnUsed,MaxConnUsed,Queries,Latency_us from stats.stats_mysql_connection_pool order by srv_host;
+-----------+-----------+---------+----------+-------------+---------+------------+
| hostgroup | srv_host  | status  | ConnUsed | MaxConnUsed | Queries | Latency_us |
+-----------+-----------+---------+----------+-------------+---------+------------+
| 10        | 10.4.4.21 | SHUNNED | 0        | 0           | 0       | 289        |
| 30        | 10.4.4.21 | ONLINE  | 0        | 0           | 0       | 289        |
| 20        | 10.4.4.21 | ONLINE  | 0        | 0           | 0       | 289        |
| 10        | 10.4.4.22 | SHUNNED | 0        | 0           | 0       | 281        |
| 30        | 10.4.4.22 | ONLINE  | 0        | 0           | 0       | 281        |
| 20        | 10.4.4.22 | ONLINE  | 0        | 0           | 0       | 281        |
| 10        | 10.4.4.23 | ONLINE  | 1        | 1           | 106     | 277        |
| 30        | 10.4.4.23 | ONLINE  | 0        | 0           | 0       | 277        |
+-----------+-----------+---------+----------+-------------+---------+------------+

My ping-log:
Admin> select * from monitor.mysql_server_ping_log order by time_start_us DESC limit 0,10;
+-----------+------+------------------+----------------------+------------+
| hostname  | port | time_start_us    | ping_success_time_us | ping_error |
+-----------+------+------------------+----------------------+------------+
| 10.4.4.23 | 3306 | 1597916943398296 | 257                  | NULL       |
| 10.4.4.21 | 3306 | 1597916943298413 | 232                  | NULL       |
| 10.4.4.22 | 3306 | 1597916943198499 | 276                  | NULL       |
| 10.4.4.22 | 3306 | 1597916935355114 | 220                  | NULL       |
| 10.4.4.23 | 3306 | 1597916935277079 | 259                  | NULL       |
| 10.4.4.21 | 3306 | 1597916935199034 | 232                  | NULL       |
| 10.4.4.22 | 3306 | 1597916927359456 | 175                  | NULL       |
| 10.4.4.21 | 3306 | 1597916927278902 | 242                  | NULL       |
| 10.4.4.23 | 3306 | 1597916927198335 | 302                  | NULL       |
| 10.4.4.21 | 3306 | 1597916919324515 | 189                  | NULL       |
+-----------+------+------------------+----------------------+------------+
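Beyond the ping log, ProxySQL's Galera monitor keeps its own log table that records why it moves nodes between hostgroups; inspecting it helped me rule out monitoring failures. A sketch of the query (column names are from ProxySQL 2.0.x; verify against your version with `SHOW CREATE TABLE monitor.mysql_server_galera_log`):

```sql
SELECT hostname, port, time_start_us,
       primary_partition, read_only, wsrep_local_state, error
FROM monitor.mysql_server_galera_log
ORDER BY time_start_us DESC
LIMIT 9;
```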

Thank you all.

gwar

2 Answers


I found the reason why they are marked as "SHUNNED":

My hostgroup configuration has max_writers = 1, which means that ONLY one of my 3 writers (in the writer hostgroup) should be active at a time; the other 2 stand by so they can take over if the current writer fails.

It is explained very well here, under "ProxySQL 2.x Support for Galera Cluster": https://severalnines.com/database-blog/how-run-and-configure-proxysql-20-mysql-galera-cluster-docker.
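If you actually want all three nodes active in the writer hostgroup, the setting can be raised; a sketch (trade-off: concurrent writers on Galera increase the chance of certification conflicts):

```sql
-- Allow all three nodes to act as writers at once:
UPDATE mysql_galera_hostgroups SET max_writers = 3 WHERE writer_hostgroup = 10;

-- The Galera hostgroup config is part of the "mysql servers" module:
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
```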

gwar

We have a similar issue where an instance in one infrequently used pool gets shunned but passes all its checks. The only thing I've found that brings it back online is to run `LOAD MYSQL SERVERS TO RUNTIME` on one of the ProxySQL cluster nodes.
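That is, on the Admin interface of one node (the statement rebuilds the runtime server table from the in-memory configuration, which clears stale SHUNNED state):

```sql
LOAD MYSQL SERVERS TO RUNTIME;
-- Verify the status afterwards:
SELECT hostgroup_id, hostname, status FROM runtime_mysql_servers;
```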

Wandering Zombie