We have a Percona XtraDB Cluster with about 11 nodes. One of the nodes crashed about 2 days ago and now fails to start again, even after the donor indicates that the SST process is complete and the node has JOINED the cluster.

When I check the log of the crashed node that fails to start, I keep seeing this error repeatedly (at intervals of hours):

[Warning] WSREP: Failed to report last committed [xxxxxx] -4 (Interrupted system call)

Apart from this message, which shows up in the error log once every few hours, the only line being logged before and after it is:

....
2015-10-19 11:23:48 9091 [Note] WSREP: (f771e66c, 'tcp://0.0.0.0:4567') address 'tcp://192.168.2.100:4567' pointing to uuid f771e66c is blacklisted, skipping
2015-10-19 11:23:48 9091 [Note] WSREP: (f771e66c, 'tcp://0.0.0.0:4567') address 'tcp://192.168.2.100:4567' pointing to uuid f771e66c is blacklisted, skipping
2015-10-19 11:23:48 9091 [Note] WSREP: (f771e66c, 'tcp://0.0.0.0:4567') address 'tcp://192.168.2.100:4567' pointing to uuid f771e66c is blacklisted, skipping
[Warning] WSREP: Failed to report last committed [xxxxxx] -4 (Interrupted system call)
2015-10-19 11:23:48 9091 [Note] WSREP: (f771e66c, 'tcp://0.0.0.0:4567') address 'tcp://192.168.2.100:4567' pointing to uuid f771e66c is blacklisted, skipping
2015-10-19 11:23:48 9091 [Note] WSREP: (f771e66c, 'tcp://0.0.0.0:4567') address 'tcp://192.168.2.100:4567' pointing to uuid f771e66c is blacklisted, skipping
2015-10-19 11:23:48 9091 [Note] WSREP: (f771e66c, 'tcp://0.0.0.0:4567') address 'tcp://192.168.2.100:4567' pointing to uuid f771e66c is blacklisted, skipping

....
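
To make the repetition easier to read, here is a minimal sketch that condenses the "blacklisted" notes and flags whether the blacklisted address points back at the node that wrote the log (in the excerpt above, both identifiers are f771e66c). The regular expression assumes exactly the line format shown above; the function name `summarize_blacklist` and the `sample` list are made up for illustration and are not part of any Percona or Galera tooling.

```python
import re
from collections import Counter

# Assumption: lines follow the "[Note] WSREP: (...) address '...' pointing to
# uuid ... is blacklisted" format seen in the excerpt above.
BLACKLIST_RE = re.compile(
    r"\[Note\] WSREP: \((?P<own>[0-9a-f]+), '[^']*'\) "
    r"address '(?P<addr>[^']+)' pointing to uuid (?P<peer>[0-9a-f]+) is blacklisted"
)


def summarize_blacklist(lines):
    """Count blacklist notes per (address, points_to_self) pair."""
    counts = Counter()
    for line in lines:
        match = BLACKLIST_RE.search(line)
        if match is None:
            continue  # not a blacklist note, e.g. the "Failed to report" warning
        # True when the blacklisted uuid equals the identifier of the logging node.
        points_to_self = match.group("own") == match.group("peer")
        counts[(match.group("addr"), points_to_self)] += 1
    return counts


# Hypothetical sample built from the lines quoted above.
sample = [
    "2015-10-19 11:23:48 9091 [Note] WSREP: (f771e66c, 'tcp://0.0.0.0:4567') "
    "address 'tcp://192.168.2.100:4567' pointing to uuid f771e66c is blacklisted, skipping",
    "[Warning] WSREP: Failed to report last committed [xxxxxx] -4 (Interrupted system call)",
    "2015-10-19 11:23:48 9091 [Note] WSREP: (f771e66c, 'tcp://0.0.0.0:4567') "
    "address 'tcp://192.168.2.100:4567' pointing to uuid f771e66c is blacklisted, skipping",
]

for (addr, points_to_self), n in summarize_blacklist(sample).items():
    print(addr, "self" if points_to_self else "other node", n)
```

In practice the lines would come from the node's error log file rather than a hard-coded list; the path to that file depends on the server configuration.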

What might be causing this? Why won't this node start again? And how can I fix the node, start it, and have it rejoin the cluster?

The Georgia