
I'm a bit confused about how to optimize my SQL queries. I have a mid-complex query with some joins to run a hundred to thousand times per second (two tables on SSD, one table in RAM, 4 JOINs)

How can I minimize the overhead for execution? Is there any way to precompile a query so that MySQL does not need to analyze, optimize and compile the query each single time?

I know I can use a prepared statement to precompile a query that is then executed multiple times in the same session. But what do you do if you have multiple sessions and only one query per session? Are prepared statements cached across different sessions? I don't think so.

Then I thought that stored procedures were the best way, because they're said to be precompiled. Now I read that this assumption is totally wrong, they are in fact not precompiled.

Is there any way to share client-sessions in MySQL, e.g. to have prepared statements from the first session inherited by the following sessions?

My last idea is to write a multithreaded socket server to act as a MySQL client proxy. But that seems a little bit excessive to me. ;-)

I use PHP as Apache2 module. Is there any chance to "store" the MySQL session in shared memory, so that the following HTTP requests can use the existing MySQL session instead of starting a new one? So that I can use prepared statements across different HTTP requests?

McJoey
    "run a hundred to thousand times per second" -- care to expand? Not the usual web app –  May 09 '15 at 22:26
    You seem to be really focusing on sharing/caching the compiled queries across sessions but this is only going to increase performance if your bottleneck is due to compilation rather than I/O. Is indexing and caching out of the question here? Why are your queries running 1000 times per second? I think we need to know more about your actual problem to help. – Phil May 09 '15 at 22:51
  • use memcache. it always save your life. – hakki May 10 '15 at 00:01
    @lcjury that's absolutely incorrect, unless you are using queries with derived tables, which create implicit temporary tables. An ordinary inner or outer join (or even several in the same query) does no such thing. – Michael - sqlbot May 10 '15 at 03:20
  • Stores procedures *are* precompiled, in a sense, but not at declaration time. The first run of a given procedure by a given session is when this occurs. – Michael - sqlbot May 10 '15 at 03:23
  • There are _far too many_ answers to this question. Please provide a small number of SQL statements for us to make suggestions about. – Rick James May 11 '15 at 00:27
  • @RickJames: please excuse my bad English if my question was too unclear. I'm quite experienced in writing highly optimized SQL queries. My question was about the overhead each query causes in the underlying interfaces and the MySQL query optimizer itself, and how this overhead can be reduced. MySQL offers some features to reduce per-query overhead within the current session, which makes sense if you have many queries in one session. But not with a lot of sessions and always the same query per session. – McJoey May 11 '15 at 01:34
  • @Michael-sqlbot: I already make use of persistent connections and PDO/mysqlnd, it seems (and I also read it somewhere) that the state of each session remains unchanged after php script completion. So this procedure should already be compiled if such a connection will be re-used. Is there any way how to prove this? – McJoey May 11 '15 at 01:47
  • http://dev.mysql.com/doc/internals/en/sp-cache-sp.html ... one somewhat roundabout way to prove this: If you run a procedure, then, in a different thread, alter a table used by that procedure in certain ways, the procedure will fail on subsequent executions until you either disconnect/reconnect, or flush the table. – Michael - sqlbot May 11 '15 at 01:58

3 Answers


Q: Is there a way to "re-use" a MySQL connection so that a subsequent request can make use of an existing connection?

A: Yes. You can use a connection pool implementation. This is a familiar pattern with Java, with several implementations available.

For a connection pool implementation in PHP, you can use the PHP extension mysqlnd_ms.

Reference: http://php.net/manual/en/mysqlnd-ms.pooling.php

NOTE: I don't have personal experience with this PHP extension.
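Short of a separate pooling layer, PDO's built-in persistent connections give a related effect under mod_php: the connection outlives the script and can be picked up by a later request served by the same Apache process (the driver may still reset session state on re-use). A minimal sketch, with placeholder DSN and credentials:

```php
<?php
// Persistent connections: PDO keeps the underlying MySQL connection open
// after the script ends and re-uses it for the next request in this process.
$options = [
    PDO::ATTR_PERSISTENT => true,                   // re-use an idle connection if available
    PDO::ATTR_ERRMODE    => PDO::ERRMODE_EXCEPTION, // throw on errors
];
// Placeholder DSN/credentials -- adjust for your environment:
// $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret', $options);
```

This saves the connect/authenticate round trip, which is often the bigger cost per request.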


Some of the other questions you asked...

Q: How can I minimize the overhead for execution? Is there any way to precompile a query so that mysql does not need to analyze, optimize and compile the query each single time?

A: You can use server-side prepared statements. The server parses the statement once per session, so repeated executions of the same statement skip the analyze/parse step. (Server-side prepared statements have been available since MySQL 4.1; note that a prepared statement is scoped to the session that prepared it.)
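From PHP, note that PDO emulates prepares client-side for MySQL by default; turning emulation off makes `prepare()` send a real server-side PREPARE, so repeated `execute()` calls in the same session re-use it. A sketch (the DSN, credentials, and `orders` table are illustrative):

```php
<?php
// Turn off client-side emulation so prepare() is a real server-side PREPARE:
$options = [
    PDO::ATTR_EMULATE_PREPARES => false,
    PDO::ATTR_ERRMODE          => PDO::ERRMODE_EXCEPTION,
];
$sql = 'SELECT id, total FROM orders WHERE customer_id = ?';

// With a live server (placeholder credentials, not run here):
// $pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret', $options);
// $stmt = $pdo->prepare($sql);   // parsed once by the server
// foreach ($customerIds as $cid) {
//     $stmt->execute([$cid]);    // re-uses the server-side prepared statement
//     $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
// }
```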

Reducing the amount of "connection churn" will also reduce MySQL overhead. Connecting and disconnecting is work the database server has to do. It's simple enough to test and compare performance: in one process, open a connection, do some repeated work (repeatedly execute a simple statement, like SELECT NOW()), then disconnect. In another process, run the same repeated executions of the SELECT, but connect and disconnect for each execution.
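The comparison can be scripted with a small timing helper; only the harness below is concrete, and the connection details in the usage sketch are placeholders:

```php
<?php
// Times $n repetitions of $work and returns the elapsed seconds.
function benchmark(callable $work, int $n): float {
    $start = microtime(true);
    for ($i = 0; $i < $n; $i++) {
        $work();
    }
    return microtime(true) - $start;
}

// Usage sketch -- assumes a reachable MySQL server (placeholder credentials):
// $pdo   = new PDO('mysql:host=localhost', 'user', 'secret');
// $reuse = benchmark(function () use ($pdo) {
//     $pdo->query('SELECT NOW()')->fetch();
// }, 1000);
// $churn = benchmark(function () {
//     $tmp = new PDO('mysql:host=localhost', 'user', 'secret');
//     $tmp->query('SELECT NOW()')->fetch();
//     $tmp = null; // disconnect
// }, 1000);
```

The gap between `$reuse` and `$churn` tells you how much of your budget connection setup is eating.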

Q: Are prepared statements cached across different sessions?

A: No. Prepared statements are cached at the session level.

Q: Is there any way to share client-sessions in MySQL, e.g. to use prepared statements in the first session inherited by the following sessions?

A: No. The only way to get that to happen is to not disconnect from the database, and hand the existing session to a subsequent client that requests a connection. And the way we get that to happen is by implementing a connection pool.


spencer7593
  • Thank you very much, this is exactly what I wanted to know. However, I don't really understand why server-side prepared statements are cached in the session instead of globally, so a true advantage only comes in combination with persistent connections. Anyway, that really sounds promising, as does the connection pooling. Thanks again! – McJoey May 11 '15 at 01:58

I would say that one of the best practices here is to avoid queries and functions that take O(n) time. To perform well under high load, every query should run in O(1) extra time. What that actually means is: no matter how much data you are fetching, the query always takes roughly the same time (constant time).

What I mean by that is: if you store the IDs in a $_SESSION variable, you can later reuse them whenever you need to, reducing those queries from O(n) to O(1).
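For example, a sketch of that idea (the query and the `matching_ids` key are made up):

```php
<?php
session_start();

// Run the expensive search only once per PHP session and cache the IDs:
if (!isset($_SESSION['matching_ids'])) {
    // Placeholder for the expensive search query:
    // $_SESSION['matching_ids'] = $pdo->query('SELECT id FROM items WHERE ...')
    //                                 ->fetchAll(PDO::FETCH_COLUMN);
    $_SESSION['matching_ids'] = [];  // fill from your query
}

// Later requests can fetch rows by primary key (a cheap indexed lookup)
// instead of re-running the search:
$ids = $_SESSION['matching_ids'];
```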

Llogari Casas
  • While I agree with most of what you are saying, making all your queries O(1) is easier said than done, and unless you are doing insert queries to a serial log, it's normally impossible. With an indexed SELECT you can get O(log(n)), where an increase in data will affect speed less, but you can seldom completely stop query slowdown as your data grows. – Phil May 09 '15 at 23:02
  • I completely agree with what you are saying. In fact, I know how hard it is to achieve that. However, if we are dealing with such an amount of data, we could try to look for a more scalable solution, such as NoSQL. I would say that a document-based NoSQL system would work perfectly in that scenario @Phil_1984_ – Llogari Casas May 09 '15 at 23:09
  • O(1) is unfortunately not possible for us :-) But the queries are not time-sensitive, i can modify time intervals and may even drop queries when running out of resources. – McJoey May 11 '15 at 02:00

If you are bogged down in reads, Slaves can allow almost unlimited read-scaling.

Other issues:

- Batching INSERTs helps a lot.
- InnoDB is better at concurrent access.
- Cross-country latency can lead to delays that can be mitigated by Stored Procedures (and other techniques).
- In general, the overhead you are asking about is less important than query optimizations.
- SPs sometimes help because of less latency.
- *Nix is better than Windows.
- Too many 'simultaneous' connections can be counter-productive.
- PHP Sessions are unlikely to be useful.
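On the batching point: a single multi-row INSERT costs one round trip and one parse instead of hundreds. A helper along these lines (table and column names are illustrative) builds the placeholder list for a batched prepared statement:

```php
<?php
// Builds one multi-row "INSERT ... VALUES (...), (...)" statement with a
// "?" placeholder per value, so many rows go in with a single round trip.
function buildBatchInsert(string $table, array $columns, int $rowCount): string {
    $row = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';
    return sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, $rowCount, $row))
    );
}

// buildBatchInsert('events', ['user_id', 'action'], 3)
// → INSERT INTO events (user_id, action) VALUES (?, ?), (?, ?), (?, ?)
```

Bind the flattened row values in order and execute once per batch instead of once per row.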

Rick James