16

I've done quite a bit of reading before asking this, so let me preface by saying I am not running out of connections, or memory, or cpu, and from what I can tell, I am not running out of file descriptors either.

Here's what PHP throws at me when MySQL is under heavy load:

Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (11 "Resource temporarily unavailable")

This happens randomly under load - but the more I push, the more frequently php throws this at me. While this is happening I can always connect locally through the console and from PHP through 127.0.0.1 instead of "localhost" which uses the faster unix socket.

Here's a few system variables to weed out the usual problems:

cat /proc/sys/fs/file-max = 4895952
lsof | wc -l = 215778 (during "outages")

Highest usage of available connections: 26% (261/1000)

InnoDB buffer pool / data size: 10.0G/3.7G (plenty o room)

  • soft nofile 999999
  • hard nofile 999999

I am actually running MariaDB (Server version: 10.0.17-MariaDB MariaDB Server)

These results are generated both under normal load, and by running mysqlslap during off hours, so, slow queries are not an issue - just high connections.

Any advice? I can report additional settings/data if necessary - mysqltuner.pl says everything is a-ok

and again, the revealing thing here is that connecting via IP works just fine and is fast during these outages - I just can't figure out why.

Edit: here is my my.ini (some values may seem a bit high from my recent troubleshooting changes, and please keep in mind that there are no errors in the MySQL logs, system logs, or dmesg)

socket=/var/lib/mysql/mysql.sock
skip-external-locking
skip-name-resolve
table_open_cache=8092
thread_cache_size=16
back_log=3000
max_connect_errors=10000
interactive_timeout=3600
wait_timeout=600                                                                                            
max_connections=1000
max_allowed_packet=16M
tmp_table_size=64M
max_heap_table_size=64M
sort_buffer_size=1M
read_buffer_size=1M
read_rnd_buffer_size=8M
join_buffer_size=1M
innodb_log_file_size=256M
innodb_log_buffer_size=8M
innodb_buffer_pool_size=10G

[mysql.server]
user=mysql

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
open-files-limit=65535
Chad E.
  • 1,196
  • 9
  • 12
  • What is your disk I/O like? if your bottleneck isn't hit on memory, cpu or connections, it's most likely related to disk I/O under load not keeping up with .sock. Have you tried not using a socket? – user3036342 Mar 03 '15 at 10:14
  • At my absolute worst there was still 0% iowait (and html pages serve nice and fast, console is fast, etc, so not a disk IO problem) - I can try not using the local socket - but that just breeds network issues by introducing a bunch more load the the TCP stack of an already busy server. I would prefer to stay with the faster and recommended method of local sockets. – Chad E. Mar 03 '15 at 10:19
  • Its a possible bug. Try to change your version – A. Blub Mar 05 '15 at 13:27
  • How do you run your PHP code ? For example PHP-FPM has a setting for the maximum limit of file descriptors, I'm not sure if this affects only the master process or if it's shared across all worker processes, but you may want to check that out. –  Mar 05 '15 at 13:44
  • could you post your `my.ini` contents (without comment lines) please? – Alex Mar 05 '15 at 14:58
  • Did you ever check the concurrent connection limit at MySQL? – Santa's helper Mar 06 '15 at 14:18
  • @A.Blub - I am running the latest available packages. – Chad E. Mar 06 '15 at 16:08
  • 1
    @AndréDaniel - I made a simple test script to try and connect and output any errors, then I ran it from the command line to remove web services from the equation. I also used mysqlslap at 3am to generate the load to nullify incoming connection or complex query issues. Bottom line is something is going wrong with socket connections between PHP and Mysql under medium size load - but performance and TCP connections are not affected. – Chad E. Mar 06 '15 at 16:11
  • @Alex - I've posted it above – Chad E. Mar 06 '15 at 16:18
  • As i said. Change your version to check it. I had already some stupid problems with highload and the current "stable" version. I had a workaround until i understood, thats only a bug... – A. Blub Mar 06 '15 at 16:47
  • To rule out potential issues with MySQL or PDO, try making a simple server listening on an Unix socket and then a PHP client for it (just connect to the socket) and then run multiple instances of that client while loading the machine. If you can reproduce the issue with this it means there's something wrong with PHP's handling of UNIX sockets and MySQL isn't even at fault. –  Mar 06 '15 at 17:06
  • @AndréDaniel - I have created a perl client to connect to the MySQL server via sockets and report errors, I am going to test it under load tonight, that will at least tell me if it's PHP or not. I'll continue with a general socket test once PHP has been ruled out. I also use sockets between Nginx and PHP-FPM without any issues. It doesn't rule out an OS issue, but it does give me some evidence. – Chad E. Mar 06 '15 at 18:34
  • @AndréDaniel - I just remembered that Zabbix (which uses the local socket) reports MySQL down during these issues, so it is definitely not a php related issue, rather a MySQL/Linux sockets issue. – Chad E. Mar 06 '15 at 19:48

3 Answers3

10

Most likely it is due to net.core.somaxconn What is the value of /proc/sys/net/core/somaxconn

net.core.somaxconn 

# The maximum number of "backlogged sockets".  Default is 128.

Connections in the queue which are not yet connected. Any thing above that queue will be rejected. I suspect this in your case. Try increasing it according to your load.

as root user run

echo 1024 > /proc/sys/net/core/somaxconn 
DBHash.com
  • 584
  • 2
  • 7
  • It was set to 4096, I had changed it yesterday when I found it posted as a solution for a php-fpm/nginx issue with local sockets. Now that traffic is low, I had a chance to try it again, and I discovered that the errors have gone away! I'm going to award you the bounty on this, because I believe you are right! This was most likely the culprit, thanks! – Chad E. Mar 07 '15 at 14:09
  • 1
    This was the final list of settings changed regarding net.core: net.core.somaxconn = 4096 net.core.netdev_max_backlog = 4096 net.core.rmem_max = 16777216 net.core.wmem_max = 16777216 – Chad E. Mar 07 '15 at 14:14
  • Good to know that your problem is fixed. Thank you – DBHash.com Mar 07 '15 at 15:14
0

This is something that can and should be solved by analysis. Learning how to do this is a great skill to have.

Analysis to find out just what is happening under a heavy load...number of queries, execution time should be your first step. Determine the load and then make the proper db config settings. You might find you need to optimize the sql queries instead!

Then make sure the PHP db driver settings are in alignment as well to fully utilize the database connections.

Here is a link to the MariaDB threadpool documentation. I know it says version 5.5, but its still relevant and the page does reference version 10. There are settings listed that may not be in your .cnf file that you can use.

https://mariadb.com/kb/en/mariadb/threadpool-in-55/

Reenactor Rob
  • 1,508
  • 1
  • 11
  • 20
  • I appreciate your candor, but I have been doing this for quite a while, followed best practices, read many books on the subject, and have never encountered this issue before. I am posting here because regardless of query (as evidenced by using only mysqlslap) to put moderate ( <25% cpu load) on the machine, I am getting these errors from PHP-FPM, all the while the database responds quickly via TCP or command line. optimization of the database has no effect on this phenomenon. I don't post questions here often, only the really difficult stuff. – Chad E. Mar 05 '15 at 17:29
  • I don't mean to be rude, but I am not looking for "read the manual" as a solution. I am not running out of CPU, RAM, or file descriptors, the Database is not getting slow, it responds wonderfully. There are no errors in dmesg, or mysql's log. This is likely an OS or a php issue, I'm not sure which one, I'm here to get some help because after extensive research I am stumped - I have read the manuals. – Chad E. Mar 05 '15 at 17:38
  • This methodology is what I use to solve such issues. I only meant for you review the settings not RTFM. Suit yourself. – Reenactor Rob Mar 05 '15 at 18:23
  • Maybe you should also ask in the ServerFault site of Stackexchange? (or instead) – Lexib0y Mar 05 '15 at 19:30
  • @Rob, I'm sorry, I'm just frustrated. I don't know where to look now that I've pored over every manual and forum I know of, and the only error I have is vague. I was hoping someone here would have experienced the same problem and had a solution. – Chad E. Mar 06 '15 at 00:04
  • @Lexib0y, thanks - good idea - I've posted on the DBA forum at stack exchange. – Chad E. Mar 06 '15 at 00:18
0

From the top of my head, I can think of max_connections as a possible source of the problem. I'd increase the limit, to at least eliminate the possibility.

Hope it helps.

Théo T. Carranza
  • 7,813
  • 1
  • 21
  • 14
  • Thanks for the answer. I mentioned above that I am tracking that - Highest usage of available connections: 26% (261/1000) – Chad E. Mar 06 '15 at 13:42