I've been trying with intermittent success to restore my ndb_backups to a new cluster.

We have a cluster with 6 NDB data nodes and 3 API nodes. When I run ndb_restore, the first 2 or 3 node backups usually restore without issue, but the 4th and 5th fail with the following error:

Temporary error: 266: Time-out in NDB, probably caused by deadlock 
Temporary error: 266: Time-out in NDB, probably caused by deadlock 
Retried transaction 10 times. 
Last error266: Time-out in NDB, probably caused by deadlock 
...Unable to recover from errors. Exiting... 

Strangely, sometimes I can simply rerun all 6 backups and it finishes successfully.
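
For anyone trying to reproduce this, the restore sequence has roughly the shape sketched below. This is a dry run that only prints the ndb_restore commands instead of executing them. The backup ID (1) and the backup path (/backups/BACKUP-1) are placeholders made up for illustration, not my real values; the node IDs and connect string are the ones from the config further down. Metadata (-m) is restored once, with the first node's backup, and data (-r) for every node.

```shell
#!/bin/sh
# Dry-run sketch: prints one ndb_restore command per data-node backup.
# BACKUP_ID and BACKUP_PATH are illustrative placeholders (assumptions).
CONNECT="192.168.207.133,192.168.207.45"   # management servers from the config
BACKUP_ID=1
BACKUP_PATH="/backups/BACKUP-${BACKUP_ID}"

print_restore_cmds() {
    first=1
    for node in 3 4 5 6 7 8; do            # data node IDs from the config
        if [ "$first" -eq 1 ]; then
            # Restore table metadata (-m) only once, with the first node's data.
            echo "ndb_restore -c $CONNECT -n $node -b $BACKUP_ID -m -r --backup_path=$BACKUP_PATH"
            first=0
        else
            echo "ndb_restore -c $CONNECT -n $node -b $BACKUP_ID -r --backup_path=$BACKUP_PATH"
        fi
    done
}

print_restore_cmds
```

The commands are run one after another, not in parallel.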

I'm hoping someone knows what kind of tweaks I can make to my configuration to optimize this process. Here are my version numbers and configs:

mysql-5.6.22 ndb-7.3.8

My mgm configuration file:

###################### 
#MGM CONFIG 
###################### 
[ndbd default] 
# Options affecting ndbd processes on all data nodes: 
NoOfReplicas=2 # Number of replicas 
DataMemory=8144M # How much memory to allocate for data storage 
IndexMemory=8144M # How much memory to allocate for index storage 
# For DataMemory and IndexMemory, we have used the 
# default values. Since the "world" database takes up 
# only about 500KB, this should be more than enough for 
# this example Cluster setup. 

[ndb_mgmd] 
# Management process options: 
hostname=192.168.207.133 # Hostname or IP address of MGM node 
NodeId=1 

[ndb_mgmd] 
# Management process options: 
hostname=192.168.207.45 # Hostname or IP address of MGM node 
NodeId=2 

[ndbd] 
# Options for data node "A": 
hostname=192.168.207.135 # Hostname or IP address 
NodeId=3 

[ndbd] 
# Options for data node "B": 
hostname=192.168.207.171 # Hostname or IP address 
NodeId=4 


[ndbd] 
# Options for data node "C": 
hostname=192.168.207.174 # Hostname or IP address 
NodeId=5 


[ndbd] 
# Options for data node "D": 
hostname=192.168.207.27 # Hostname or IP address 
NodeId=6 


[ndbd] 
# Options for data node "E": 
hostname=192.168.207.169 # Hostname or IP address 
NodeId=7 


[ndbd] 
# Options for data node "F": 
hostname=192.168.207.178 # Hostname or IP address 
NodeId=8 


[mysqld] 
hostname=192.168.207.177 
NodeId=10 

[mysqld] 
hostname=192.168.207.35 
NodeId=11 

[mysqld] 
hostname=192.168.207.148 
NodeId=12 

My mysqld and NDB node config:

###################### 
#API AND NDB CONFIG 
###################### 
[mysqld] 
ndbcluster 

[mysql_cluster] 
ndb-connectstring=192.168.207.133,192.168.207.45 # location of management server 

Really hope someone can help. I've been at this for a month. We use data blobs quite extensively and I understand that this can cause these time-outs, but I'm most curious to find out why the 6-node restore sometimes succeeds and sometimes doesn't, and how I can go about ensuring the restore is successful every time.

I'm very open to trying things and reposting the results. I'm new to MySQL Cluster and have learned piles in the past few months, but am eager to learn more.

Thanks in advance, GT
