
I have a LAMP stack running Percona MySQL.

Via New Relic, I'm monitoring, tweaking, and tuning my new server to get it ready to go live.

Something that's irritating me, though, is that I get periodic spikes (usually after periods of inactivity) where the DB response time for simple queries that normally return within a few milliseconds jumps and spoils my averages.

It's as if the cache is being paged out, but it isn't: the server has 3 GB of RAM, hovers at around 512 MB consumption, and shows 0 bytes paged.

See graph.

Any idea what could be causing this? It's not as likely to be an issue in production, as there'll be a constant flow of traffic keeping it all alive, but I'd still like to investigate in case there's something afoot.

Thanks.

[Graph: periodic DB response-time spikes after idle periods]

i-CONICA

3 Answers


Just a shot in the dark: perhaps it's resolving the DNS name of the database server.

neutrinus
  • The DB server is localhost, and Apache DNS name resolution is disabled anyway. The application connects to localhost:3306. Could this still be an issue then? – i-CONICA Feb 24 '14 at 10:27
  • Also, could that really take 500ms? :S Thanks. – i-CONICA Feb 24 '14 at 10:28
  • Yeah, it could take even more. Please set up a network sniffer (like tcpdump) to check what is going on at the network level. – neutrinus Feb 25 '14 at 07:10
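For a starting point, a tcpdump capture along the lines suggested above might look like this (a sketch: the loopback interface name `lo` and the capture file path are assumptions; note that if the client actually connects over the Unix socket rather than TCP, nothing will appear here at all):

```shell
# Capture MySQL traffic on the loopback interface (assumes the client
# connects over TCP on port 3306). Requires root privileges.
tcpdump -i lo -s 0 -w mysql.pcap 'port 3306'

# Later, replay the capture with inter-packet deltas; look for long gaps
# between a query packet and its response.
tcpdump -r mysql.pcap -ttt
```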

My best guess, without closer inspection, would be either automatic vacuuming or journaled transactions syncing to disk. You said the errant behavior is periodic; what's the period? Does the period coincide with any MySQL tunables?

etherfish
  • Hi, Thanks. There is no regularity to the period; it seems to happen after a period of low or complete inactivity. If I keep the server active with regular requests, that seems to "keep it alive" and it remains consistently fast. These blips seem to "wake up" the database. Would it be helpful to see the contents of my.cnf? – i-CONICA Feb 24 '14 at 11:12

DNS might be the problem indeed. Even if MySQL is on localhost, it does a DNS lookup for connecting clients unless skip-name-resolve is set in my.cnf. And yes, that can take 500 ms or more.
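For reference, the relevant my.cnf fragment would be along these lines (a sketch; the `[mysqld]` section is the standard place for it):

```ini
[mysqld]
# Skip reverse-DNS lookups on client connections entirely.
# Caveat: with this set, GRANTs must be defined against IP addresses,
# since hostnames in the privilege tables can no longer be matched.
skip-name-resolve
```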

I would collect the slow log with long_query_time=0 and analyze it with pt-query-digest. It will reveal possible problems with the queries, such as locking time longer than usual, a wrong query execution plan, or occasional long I/O requests.
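A sketch of how that might be set up at runtime (the variables are standard MySQL ones; the slow-log path below is an assumption, check `slow_query_log_file` on your server):

```sql
-- Enable the slow query log and capture every query, not just slow ones.
SET GLOBAL slow_query_log = 1;
SET GLOBAL long_query_time = 0;
```

Then, after letting it run through one of the spikes, digest the log from the shell with something like `pt-query-digest /var/lib/mysql/slow.log`, and look at the queries with the worst 95th-percentile response times.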

If the spikes repeat daily, collect the log around that time.

You can also increase the slow log verbosity; it will give more details for analysis.

Compare the query response time graph with the InnoDB checkpoint age graph: maybe InnoDB is doing storm flushing at that time?
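Checkpoint age isn't exposed as a single status variable, but it can be read off the LOG section of the InnoDB status output (a sketch; `\G` is the mysql client's vertical-output terminator):

```sql
-- In the LOG section, checkpoint age is the difference between
-- "Log sequence number" and "Last checkpoint at". If it approaches the
-- redo log capacity, InnoDB will flush aggressively, stalling queries.
SHOW ENGINE INNODB STATUS\G
```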

akuzminsky
  • Hi, Thanks. I do have skip-name-resolve specified in my.cnf. Will having the application connect to 127.0.0.1:3306 be better than localhost? – i-CONICA Feb 25 '14 at 12:09
  • Hi, I've found this from the DB: Innodb_row_lock_time: 10.2k (total time spent acquiring row locks, in ms); Innodb_row_lock_time_avg: 24 (average time to acquire a row lock, in ms); Innodb_row_lock_time_max: 5.1k (maximum time to acquire a row lock, in ms); Innodb_row_lock_waits: 409 (number of times a row lock had to be waited for). So it seems like some InnoDB locking issue. :S – i-CONICA Feb 25 '14 at 13:45
  • "127.0.0.1:3306 be better than localhost" - although TCP introduces more overhead than a Unix socket, it's never the problem compared to other performance challenges – akuzminsky Feb 25 '14 at 14:12