2

I work at a fairly big website; we have ~400-500 million page views a month. We use PHP and MySQL.

Currently our page flow works like this (semi pseudo code for clarity):

mysql_connect();
mysql_select_db('red');
mysql_query('SELECT * FROM apples');
mysql_query('SELECT * FROM cakes');

One of my co-workers suggested that mysql_select_db slows down page loads and increases the load of the DB server, and suggested to change our "flow" to be like:

mysql_connect();
mysql_query('SELECT * FROM red.apples');
mysql_query('SELECT * FROM red.cakes');

Is it true that it will make a difference? I'm especially concerned about the time it will take to make this change; will we see any noticeable results?

Is it worth it?

anon
  • 23
  • 2

3 Answers3

9

Benchmark:

$ cat 1.php 
<?php
mysql_connect('localhost') or die(mysql_error());
mysql_select_db('test') or die(mysql_error());
mysql_query('SELECT SQL_NO_CACHE * FROM foo') or die(mysql_error());
mysql_query('SELECT SQL_NO_CACHE * FROM bar') or die(mysql_error());
?>
$ cat 2.php 
<?php
mysql_connect('localhost') or die(mysql_error());
mysql_query('SELECT SQL_NO_CACHE * FROM test.foo') or die(mysql_error());
mysql_query('SELECT SQL_NO_CACHE * FROM test.bar') or die(mysql_error());
?>
$ time ( for i in {1..100} ; do php 1.php; done;)

real    0m3.554s
user    0m2.300s
sys     0m1.188s
$ time ( for i in {1..100} ; do php 2.php; done;)

real    0m3.555s
user    0m2.292s
sys     0m1.208s

I call bullshit.

Is your coworker perhaps confused with the mysql command-line client, which will load all table & columnnames if you switch to a database (which can be avoided with the -A switch)?

Wrikken
  • 69,272
  • 8
  • 97
  • 136
0

i don't know for sure about your question but i doubt it would make significant difference.

but there's other things that can make. i suggest you write all the column names instead of using *. this i know for sure that will speed up your queries.

ex:

mysql_query('SELECT id, size, color FROM apples');

other thing you can do is to use LIMIT correctly. for instance, if you are selecting an user from database, and you KNOW for sure that it's unique, use LIMIT 1 at the end of the query.

ex:

SELECT id, username, access_level FROM users WHERE id = ? LIMIT 1
Hugo Mota
  • 11,200
  • 9
  • 42
  • 60
  • I can't imagine this would ever make enough of a difference to be worthwhile, especially if you have more than three or four columns. – bnaul Sep 17 '10 at 01:58
  • Not from a performance point of view perhaps, but certainly with bugtracking (i.e.: table changes fail much more transparantly). Giving you that much more time to work on better causes :P – Wrikken Sep 17 '10 at 02:02
  • i ran some tests over big loops and the difference appeared to be significant. like half the waiting time. maybe i'm not testing it right. how should i ran the tests? – Hugo Mota Sep 17 '10 at 02:08
  • at preferable an idle server, with query_cache disabled, etc. fwiw: if I run the test while selecting _all_ columns as in my other question, `select *` is about 2% slower, not really that much gain. When you can avoid BLOB / TEXT columns when you don't need them the gain could however be significant. – Wrikken Sep 17 '10 at 13:24
0

I doubt very much it would make a difference performance wise if you add the database name.

Persistent connections and refactoring your queries to never, EVER select * would be a good first step, I think. Then you might want to think about using query caching, and having a look at your slow query log. Those are going to help you more than some small semantic difference like specifying the schema name ever could.

Anthony
  • 71
  • 4