19

I have large MySQL query (1.8M rows, 25 columns) and I need to make 2 dimensional array from it (memory table based on primary key).

Code works as expected, but $table creation takes a long time in PHP7.0.

What is the reason why PHP7.0 performs so much worse? My primary interest is in mysqli.

Thank you for any insights - PHP7 would save me much memory if I can fix performance.

mysqli code snippet

$start = microtime(true);

$vysledek = cluster::query("SELECT * FROM `table` WHERE 1");
$query_time = (microtime(true) - $start);
$start_fetch = microtime(true);
while($zaznam = mysqli_fetch_assoc ( $vysledek )){
  $fetch_time+= (microtime(true) - $start_fetch);
  $start_assign = microtime(true);
  $table[$zaznam['prikey']] = $zaznam;
  $assign_time+= (microtime(true) - $start_assign);
  $start_fetch = microtime(true);
}
$total_time+= (microtime(true) - $start);
echo round($assign_time, 2).' seconds to set the array values\n';
echo round($query_time, 2).' seconds to execute the query\n';
echo round($fetch_time, 2).' seconds to fetch data\n';
echo round($total_time, 2).' seconds to execute whole script\n';
echo "Peak Memory Usage:".round(memory_get_peak_usage(true)/(1024 * 1024), 2)." MB\n";

mysqli results

Deb 7 PHP 5.4 mysqlnd 5.0.10
1.8 seconds to set the array values
8.37 seconds to execute the query
13.49 seconds to fetch data
24.42 seconds to execute whole script
Peak Memory Usage:8426.75 MB

Deb 8 PHP 5.6 mysqlnd 5.0.11-dev
1.7 seconds to set the array values
8.58 seconds to execute the query
12.55 seconds to fetch data
23.6 seconds to execute whole script
Peak Memory Usage: 8426.75 MB

Deb 8 PHP 7.0 mysqlnd 5.0.12-dev
0.73 seconds to set the array values
8.63 seconds to execute the query
126.71 seconds to fetch data
136.46 seconds to execute whole script

Peak Memory Usage:7394.27 MB

Deb 8 PHP 7.0 mysqlnd 5.0.12-dev extended benchmarking

I have extended benchmarking for section fetch to report every 100k lines with following results:

Lines fetched 100000 in 1.87s
Lines fetched 300000 in 5.24s
Lines fetched 500000 in 10.97s
Lines fetched 700000 in 19.17s
Lines fetched 900000 in 29.96s
Lines fetched 1100000 in 43.03s
Lines fetched 1300000 in 58.48s
Lines fetched 1500000 in 76.47s
Lines fetched 1700000 in 96.73s
Lines fetched 1800000 in 107.78s

DEB8 PHP7.1.0-dev libclient 5.5.50

1.56 seconds to set the array values
8.38 seconds to execute the query
456.52 seconds to fetch data
467.68 seconds to execute whole script

Peak Memory Usage:8916 MB

DEB8 PHP7.1.0-dev libclient 5.5.50 extended benchmarking

Lines fetched 100000 in 2.72s
Lines fetched 300000 in 15.7s
Lines fetched 500000 in 38.7s
Lines fetched 700000 in 71.69s
Lines fetched 900000 in 114.8s
Lines fetched 1100000 in 168.18s
Lines fetched 1300000 in 231.69s
Lines fetched 1500000 in 305.36s
Lines fetched 1700000 in 389.05s
Lines fetched 1800000 in 434.71s

DEB8 PHP7.1.0-dev mysqlnd 5.0.12-dev

1.51 seconds to set the array values
9.16 seconds to execute the query
261.72 seconds to fetch data
273.61 seconds to execute whole script

Peak Memory Usage:8984.27 MB

DEB8 PHP7.1.0-dev mysqlnd 5.0.12-dev extended benchmarking

Lines fetched 100000 in 3.3s
Lines fetched 300000 in 13.63s
Lines fetched 500000 in 29.02s
Lines fetched 700000 in 49.21s
Lines fetched 900000 in 74.56s
Lines fetched 1100000 in 104.97s
Lines fetched 1300000 in 140.03s
Lines fetched 1500000 in 180.42s
Lines fetched 1700000 in 225.72s
Lines fetched 1800000 in 250.01s

PDO code snippet

$start = microtime(true);
$sql = "SELECT * FROM `table` WHERE 1";
$vysledek = $dbh->query($sql, PDO::FETCH_ASSOC);
$query_time = (microtime(true) - $start);
$start_fetch = microtime(true);
foreach($vysledek as $zaznam){
  $fetch_time+= (microtime(true) - $start_fetch);
  $start_assign = microtime(true);
  $table[$zaznam['prikey']] = $zaznam;
  $assign_time+= (microtime(true) - $start_assign);
  $start_fetch = microtime(true);
}
$total_time+= (microtime(true) - $start);
echo round($assign_time, 2).' seconds to set the array values\n';
echo round($query_time, 2).' seconds to execute the query\n';
echo round($fetch_time, 2).' seconds to fetch data\n';
echo round($total_time, 2).' seconds to execute whole script\n';
echo "Peak Memory Usage:".round(memory_get_peak_usage(true)/(1024 * 1024), 2)." MB\n";

PDO Results

Deb 7 PHP 5.4 mysqlnd 5.0.10
1.85 seconds to set the array values
12.51 seconds to execute the query
16.75 seconds to fetch data
31.82 seconds to execute whole script
Peak Memory Usage:11417.5 MB

Deb 8 PHP 5.6 mysqlnd 5.0.11-dev
1.75 seconds to set the array values
12.16 seconds to execute the query
15.72 seconds to fetch data
30.39 seconds to execute whole script
Peak Memory Usage:11417.75 MB

Deb 8 PHP 7.0 mysqlnd 5.0.12-dev
0.71 seconds to set the array values
35.93 seconds to execute the query
114.16 seconds to fetch data
151.19 seconds to execute whole script

Peak Memory Usage:6620.29 MB

Baseline comparison code

 $start_query = microtime(true);
 exec("mysql --user=foo --host=1.2.3.4 --password=bar -e'SELECT * FROM `profile`.`table`' > /tmp/out.csv");
 $query_time = (microtime(true) - $start_query);
 echo round($query_time, 2).' seconds to execute the query \n';

Execution time is similar for all systems at 19 seconds +-1 second variation.

Based on above observations I would say that PHP 5.X is reasonable as there is a bit more work executed than just dumping to the file.

  • all 3 servers are on same host (source and both test servers)
  • tests are consistent when repeated
  • there is already similar variable in memory ,I need to do it for comparison removed for testing, is not related to the problem
  • CPU is at 100% whole time
  • Both servers have 32G RAM and swappiness set to 1, goal is to perform it as memory operation
  • test server is dedicated, there is nothing else running
  • php.ini changed between major versions but all options relating to mysqli/PDO seems to be the same
  • Deb8 machine was downgraded to PHP5.6 and issue disappeared, after reinstalling PHP7 its back

  • Reported a bug at php.net - ID 72736 since I belive that it was proven that problem is in PHP and not in the system or any other configuration

Edit 1 : Added PDO Comparison

Edit 2 : Added benchmarking markers, edited PDO results as there was benchmarking error

Edit 3 : Major cleanup in original question, rebuild of code snipets for better indication of the error

Edit 4 : added point about Downgrade and upgrade of PHP

Edit 5 : added extended benchmarking for DEB8 PHP7.0

Edit 6 : included php7 config

Edit 7 : performance measurement for PHP 7.1 dev with both libraries- compiled with configs from bishop, removed my php-config

Edit 8 : added comparison against CLI command, minor clean-ups

Zmrzka
  • 310
  • 1
  • 11
  • Just curious: have you tried the same with PDO? – Matt S Jul 27 '16 at 13:57
  • Just to be clear, you're loading the entire table contents and storing it all in a big fat array? And storing a second similar sized array in memory as well? That surely can't be the most efficient way of doing it. – Simba Jul 27 '16 at 15:27
  • 1
    Actualy it is extremly efficient, in PHP5.4 you can compare 2 tables like this from 2 different DBs in 40 - 60 seconds. At the same time operation is non-blocking for databases, you can do separate inserts/deletes/updates and original application can run above those tables all the time. Another point is that you can do it on different server so eaven if you exhaust memory you do not kill target database. – Zmrzka Jul 27 '16 at 16:59
  • 1
    Are all of the mysqli settings in your `php.ini` file the same in PHP7 as they are in PHP5? Did you install x86 or x64 for both? If you put PHP5 on Debian8 then does the same performance difference exist? Are you using the same MySQL driver on both servers? – MonkeyZeus Jul 27 '16 at 17:59
  • What is the purpose of `WHERE 1`? – MonkeyZeus Jul 27 '16 at 18:04
  • Please update your question to show exactly where you placed your benchmarking code using `microtime(true)` – MonkeyZeus Jul 27 '16 at 18:06
  • 1
    @MonkeyZeus php.ini is identical, server is (x64) and just cloned and upgraded so I could use apt-get to make easy installation of PHP7 , where 1 is just to show that its full scan - it can be omited and have no impact on query – Zmrzka Jul 27 '16 at 18:19
  • I see, please let me know the answer to the rest of my questions. In the meantime I am going to post an answer requesting for you to try a testing procedure – MonkeyZeus Jul 27 '16 at 18:21
  • Is it possible you have xdebug on for the php7 case? – Evert Jul 27 '16 at 18:22
  • What is the value of primary key? i.e., is it a string, an integer, and what size? – LSerni Jul 27 '16 at 18:32
  • If you install MySQL workbench then do those query and fetch times remain the same? – MonkeyZeus Jul 27 '16 at 18:34
  • @lserni in some cases its a integer , in some cases its a string,but its allways cast as a string – Zmrzka Jul 27 '16 at 18:49
  • @Zmrzka Have you tried the code in my answer? – MonkeyZeus Jul 27 '16 at 19:10
  • @MonkeyZeus, PHP 7 benchmark is running, after it finishes I will run PHP5.4, I can't do both at once - its a massive memoryhog – Zmrzka Jul 27 '16 at 19:15
  • @Evert if needed than probably yes but would have to investigate it first and than learn how to insatll it – Zmrzka Jul 27 '16 at 19:21
  • @MattS PDO comparsion is now included in original post – Zmrzka Jul 27 '16 at 19:23
  • 1
    Are you using `libmysqlclient` or `mysqlnd`? (The `configure` command from `php -i` will help.) The engine code goes through different paths depending upon this choice. – bishop Jul 27 '16 at 20:02
  • @bishop added in original post – Zmrzka Jul 28 '16 at 16:35
  • Those results are really strange, PHP7 takes almost 10 times as long to fetch the data which could suggest a 100Mb vs. 1,000Mb ports but you are using the same box so that theory is obviously out. Can you try benchmarking `$all_rows = mysqli_fetch_all($vysledek, MYSQLI_ASSOC); foreach($all_rows as $zaznam){ $table[$zaznam['prikey']] = $zaznam; }`? I've personally not been able to flaw PHP7 yet so this issue is really interesting to me. You are connecting to the same exact DB regardless of PHP version right? – MonkeyZeus Jul 28 '16 at 17:52
  • And if you rebuild using `libmysqlclient`, do the times improve? – bishop Jul 28 '16 at 18:17
  • @MonkeyZeus 3.12 seconds to set the array values 8.46 seconds to execute the query 124.11 seconds to fetch data 135.69 seconds to execute whole script Peak Memory Usage:7386.29 MB – Zmrzka Jul 28 '16 at 21:12
  • @MonkeyZeus DB is allways the same, comunication is thru E1000 interface (emulation of 1G interface), I will try to migrate it to VMXNET 3 (can handle 10Gb) but I will have to shut down primary DB server for that and I am not sure how fast I can do that. Still it seems that limitation is client CPU(its at 100%) . DB CPU rises barely by 1%, all data are in innodb pagefile so there are no drive operations involved – Zmrzka Jul 28 '16 at 21:31
  • @bishop this is above my level of linux skill, I found that I need to set some options when compiling php but it allways ends in error. I am not sure how to do it. if there is some reasonably complex guide I am willing to try again – Zmrzka Jul 29 '16 at 13:47
  • I would also get the full PHP info with `php -i` from both PHP installations, and compare. See if you have extra extensions enabled on PHP 7 that could affect performance. – Bill Karwin Jul 29 '16 at 14:02
  • @BillKarwin easier said than done when I have no idea in most cases what could affect performance in there , PHP7.0 have one more plugin loaded and that is auth_plugin_sha256_password, but I dont think that this is relevant. as for the rest ... there is 299 removals and 266 additions between those 2 versions, no way I can evaluate that wihout knowladge what it does. Please note that I i installed only package PHP7.0 and nothing else, everything is in default settings – Zmrzka Jul 29 '16 at 14:19
  • I am at a total loss of ideas. My last-ditch thought is to try PHP 7.1.0 Beta and see if you have something to look forward to once it is officially released. – MonkeyZeus Jul 29 '16 at 20:29
  • @MonkeyZeus done, and it was extremly diasapointing - its eaven worse – Zmrzka Jul 31 '16 at 06:38

2 Answers2

4

For cross-reference: With the release of PHP 7.1 on 1st Dec 2016 this issue should be resolved (in PHP 7.1).

PHP 7.0: Even in the ticket it's written that PHP-7.0 has been patched, I've not yet seen in the recent change-log (7.0.13 on 10 Nov 2016, since patch incorporation date) that this is part of the current PHP 7.0.x release. Perhaps with the next release.

The bug is tracked upstream (thanks to OP's report): Bug #72736 - Slow performance when fetching large dataset with mysqli / PDO (bugs.php.net; Aug 2016).

hakre
  • 193,403
  • 52
  • 435
  • 836
1

As the problem appears to be in the fetch (not the array creation), and we know the driver is running mysqlnd (which is a driver library independently written by the PHP team, not provided by MySQL AB aka Oracle), then recompiling PHP using libmysqlclient (which is the MySQL AB aka Oracle provided interface) may improve the situation (or at least narrow the problem space).

First thing I'd suggest is writing a small script that can be run from the CLI that demonstrates the problem. This will help to eliminate any other variables (web server modules, opcache, etc).

Then, I'd suggest rebuilding PHP with libmysqlclient to see if performance improves. Quick guide to rebuilding PHP (for the technically competent):

  1. Download the source for the PHP version you want
  2. Decompress and go into the PHP code directory
  3. Run ./buildconf
  4. Run ./configure --prefix=/usr --with-config-file-path=/etc/php5/apache2 --with-config-file-scan-dir=/etc/php5/apache2/conf.d --build=x86_64-linux-gnu --host=x86_64-linux-gnu --sysconfdir=/etc --localstatedir=/var --mandir=/usr/share/man --enable-debug --disable-static --with-pic --with-layout=GNU --with-pear=/usr/share/php --with-libxml-dir=/usr --with-mysql-sock=/var/run/mysqld/mysqld.sock --enable-dtrace --without-mm --with-mysql=shared,/usr --with-mysqli=shared,/usr/bin/mysql_config --enable-pdo=shared --without-pdo-dblib --with-pdo-mysql=shared,/usr CFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security -O2 -Wall -fsigned-char -fno-strict-aliasing -g" LDFLAGS="-Wl,-z,relro" CPPFLAGS="-D_FORTIFY_SOURCE=2" CXXFLAGS="-g -O2 -fstack-protector-strong -Wformat -Werror=format-security"
  5. Run make && make test
  6. Walk away
  7. Run sapi/cli/php -i and confirm the version and presence of libmysqlclient

Run your test again. Any better?

bishop
  • 37,830
  • 11
  • 104
  • 139
  • Original post updated, there was quite a few steps missing in your compilation guide but it worked in the end. I belive that it was proven that problem is in mysqli_fetch_assoc() warper function and not in mysql driver, libmysqlclient performs significantly worse, and it did worked much faster before. Cleaned up old comments since they are no linger relevant – Zmrzka Jul 30 '16 at 21:41
  • For what it's worth, mysqlnd was co-developed by Ulf Wendel, an engineer who worked for MySQL AB and now works for Oracle. http://blog.ulf-wendel.de/2007/php-mysqlnd-to-support-pdo-mysqlnd-in-php-53/ – Bill Karwin Jul 31 '16 at 06:51
  • Ok, now for a baseline, if you perform the same fetch using the command line `mysql` client, how long does it take? – bishop Jul 31 '16 at 18:16
  • curently I cant test it since data in relevant table are gone for now (reasons not related to this issue) I did tests on different table(bit smaller tho) as intemediate step and performance is same on both hosts. Its cca 30% faster than PHP5.X version. once primary test data are recovered (probably tomorow) I will update original post when I can use same data samples. – Zmrzka Aug 01 '16 at 16:36