TKLBAM backup stalling causing MySQL issues

Question

I'm having a strange issue that I've never encountered before. I'm running a Turnkey Linux LAMP server (Debian) and my MySQL server becomes inaccessible at least once a day. I have no idea what's causing it. The last few log entries before I restart it are:

160108  0:54:09 [Note] Plugin 'FEDERATED' is disabled.
160108  0:54:09 InnoDB: The InnoDB memory heap is disabled
160108  0:54:09 InnoDB: Mutexes and rw_locks use GCC atomic builtins
160108  0:54:09 InnoDB: Compressed tables use zlib 1.2.8
160108  0:54:09 InnoDB: Using Linux native AIO
160108  0:54:09 InnoDB: Initializing buffer pool, size = 128.0M
160108  0:54:09 InnoDB: Completed initialization of buffer pool
160108  0:54:09 InnoDB: highest supported file format is Barracuda.
160108  0:54:09  InnoDB: Waiting for the background threads to start
160108  0:54:10 InnoDB: 5.5.46 started; log sequence number 111777334
160108  0:54:11 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.46-0+deb8u1'  socket: '/var/run/mysqld/mysqld.sock'  port: 0  (Debian)

I haven't changed any of the default settings that I can recall, so it should be listening on 3306. I have a few WordPress sites running on the server, so having the DB go down on a whim is bad news. It comes right back up with no problems when I restart it, and says it's listening on 3306:

160108 10:20:45 [Note] Server socket created on IP: '127.0.0.1'.
160108 10:20:45 [Note] Event Scheduler: Loaded 0 events
160108 10:20:45 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.5.46-0+deb8u1'  socket: '/var/run/mysqld/mysqld.sock'  port: 3306  (Debian)

Any ideas? Thanks!

UPDATE: Here is my full log file: http://pastebin.com/2G2CAVsw

THE PROBLEM: It appears that tklbam-restore is causing the problem. I manually ran a backup and noticed that as soon as it got to the DB phase of the process, my WordPress sites could no longer access MySQL. Also, the backup process seems to be stuck at one of my DB tables. Here are the last few lines:

table: trendsandteens/wp_wfNet404s
table: trendsandteens/wp_wfReverseCache
table: trendsandteens/wp_wfScanners
table: trendsandteens/wp_wfStatus
table: trendsandteens/wp_wfThrottleLog
table: trendsandteens/wp_wfVulnScanners

It's just backing up Wordfence's tables. So I'm not really sure what the issue is... Any ideas? Here is the traceback after I interrupted the process: http://pastebin.com/QV63cBPG

run the command `dmesg` and see if you see anything that OOM has killed mysql — Mike, Jan 08 '16 at 15:33
More of the MySQL log would be helpful. And you can prevent WP from being inaccessible with webserver level caching, or solutions like Varnish or CloudFront. — JayMcTee, Jan 08 '16 at 15:44
@Mike I'm not seeing anything from OOM in dmesg or anything about MySQL. — anguiac7, Jan 08 '16 at 15:46
@JayMcTee I'll probably go for Varnish since that will handle the whole server. To my knowledge, you have to individually add sites to CloudFront and you can't just add an entire server, correct? — anguiac7, Jan 08 '16 at 15:46
Per host yes I believe so. But of course to create a cached copy, you still need MySQL to be accessible at times. — JayMcTee, Jan 08 '16 at 15:48
Another option to prevent going offline is to install monit and let it monitor mysql. When it goes down, it will restart it. You should configure it for Apache as well. — SPRBRN, Jan 08 '16 at 15:52
@SPRBRN I'll look into doing that. Thank you! What's weird is that it doesn't appear that MySQL is going down - it's just switching to port 0 from what I can tell. I could be wrong though. — anguiac7, Jan 08 '16 at 15:52
MySQL starts on port 0 if the network is down, I believe. Maybe it's related to a network issue? — SPRBRN, Jan 08 '16 at 16:10
@SPRBRN Would the network issues not show up in the error log? I'm so confused as to why it's doing this. — anguiac7, Jan 08 '16 at 17:45
What does "becomes inaccessible" really mean? Not accessible over the network? When it happens, is it possible to connect with the *mysql* client over the UNIX socket? What is the value of *bind-address* in *my.cnf*? — sam_pan_mariusz, Jan 08 '16 at 18:46
@sam_pan_mariusz I don't allow external connections to MySQL so the local sites are connected through UNIX socket. The bind-address is 127.0.0.1. — anguiac7, Jan 08 '16 at 19:03
I don't know what tklbam-restore is about, and how it affects Mysql. Maybe you can explain a bit. For my wordpress sites I use backwpup, a wordpress plugin, that can back up locally, and to dropbox / s3 / google drive etc. Maybe this doesn't satisfy your need to know the cause of the problem, and still you need to disable the current wordpress backup, but it may be a good enough solution. — SPRBRN, Jan 09 '16 at 11:03
@SPRBRN I'm running the Turnkey Linux LAMP distro (https://www.turnkeylinux.org/lampstack) which includes tklbam (https://www.turnkeylinux.org/docs/tklbam). Maybe I'll ask around on their forums. I'm not sure how active the TK community is on serverfault. — anguiac7, Jan 09 '16 at 20:29
@anguiac7 - I just came across this post. Can you please contact me directly via email so we can try to debug this issue (if you haven't already resolved it): jeremy AT turnkeylinux.org — Jeremy Davis, May 24 '16 at 02:32

score 1 · Answer 1 · answered Jan 08 '16 at 18:14

Try starting MySQL using strace and store the output to a file. Then review the output right before it terminates to see if there's anything which would point to the cause of the issue.

Be warned though that the output can grow quite large, so make sure you don't run out of disk space or otherwise negatively impact the system (such as if it requires a lot of I/O to write all the data to the disk).

If strace cuts off strings that you need in full to investigate further, raise the string length limit with the -s argument to strace.

If it's easier, you can attach strace to an existing process using -p processid.
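
As a rough sketch of the attach approach, something like the following could be used. The helper name `build_strace_cmd`, the output path `/tmp/mysqld.strace`, and the 256-byte string limit are illustrative assumptions rather than required values; the strace flags themselves (`-f`, `-tt`, `-s`, `-o`, `-p`) are standard.

```shell
#!/bin/sh
# Hypothetical helper: prints an strace command line for attaching to a
# running process. It only builds the command; it does not run strace.
build_strace_cmd() {
    pid="$1"
    # Assumed default output file; put it on a disk with plenty of free space.
    out="${2:-/tmp/mysqld.strace}"
    # -f      also trace the process's threads and children
    # -tt     prefix each line with a timestamp (microsecond resolution)
    # -s 256  print up to 256 bytes of each string (strace's default is 32)
    # -o FILE write the trace to FILE instead of stderr
    # -p PID  attach to an already running process
    printf 'strace -f -tt -s 256 -o %s -p %s\n' "$out" "$pid"
}

# Example (as root), assuming a single mysqld process:
#   eval "$(build_strace_cmd "$(pidof mysqld)")"
```

Printing the command first makes it easy to double-check the PID and output path before attaching. Stop the trace with Ctrl-C once the problem has reproduced, then read the end of the output file.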
