I used to fetch large amounts of data with mysql_query, then iterate through the result one row at a time to process the data. For example:

$mysql_result = mysql_query("select * from user");
while($row = mysql_fetch_array($mysql_result)){
    echo $row['email'] . "\n";
}

Recently I looked at a few frameworks and noticed that they fetch all the data into an array in memory and return the array:

$large_array = $db->fetchAll("select * from user");
foreach($large_array as $user){
    echo $user['email'] . "\n";
}

I would like to know the pros/cons of each method. It seems to me that loading everything into memory is a recipe for disaster if you have a very long list of items. But then again, a coworker told me that the MySQL driver has to put the result set in memory anyway. I'd like the opinion of someone who understands that the question is about performance. Please don't comment on the code itself; I just made it up as an example for the post.

Thanks

U0001

4 Answers

4

You're mixing up two matters:

  • usability, which makes your code WAY smoother with arrays;
  • and an unoptimized algorithm, where an inexperienced programmer tends to load ALL the data into the script instead of making the database do the calculations, or fetching the data in portions.

So: frameworks do not fetch all the data. They fetch just what the programmer asked for.
A good programmer would not fetch large amounts of data into an array. In the few cases where it is really needed, one would use old line-by-line fetching (and every framework provides a method for this). In all other cases, smooth already-in-array fetching should be used.

Please also note that frameworks will never do such things as echoing data right inside the database loop.
Every good framework uses a template to output things, and in that case an array comes in extremely handy.
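
For illustration, a minimal sketch of both fetching styles using PDO (the DSN, credentials, and column choice are placeholder assumptions, not anything from the question):

<?php
// Placeholder connection details -- adjust for your own setup.
$db = new PDO('mysql:host=localhost;dbname=test', 'user', 'password');

// Already-in-array fetching: smooth to work with, fine for bounded result sets.
$users = $db->query("select email from user limit 100")->fetchAll(PDO::FETCH_ASSOC);

// Line-by-line fetching: the method to reach for when the result set is large.
$stmt = $db->query("select email from user");
while ($row = $stmt->fetch(PDO::FETCH_ASSOC)) {
    echo $row['email'] . "\n";
}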

Your Common Sense
  • 156,878
  • 40
  • 214
  • 345
2

When working with large result sets, I usually process them in batches, like this:

$current = 0;
$batchSize = 1000;

while (true) {
  // Fetch the next batch; LIMIT offset, count keeps memory usage bounded.
  $large_array = $db->fetchAll(sprintf("select * from user limit %d, %d", $current, $batchSize));
  if (count($large_array) == 0) {
    // No rows left.
    break;
  }

  $current += count($large_array);
  foreach ($large_array as $user) {
    echo $user['email'] . "\n";
  }
}

As long as your code doesn't leak memory, you won't have to worry about which method uses more memory. In terms of raw speed, loading the entire result set in one query is probably faster, but you're likely to hit PHP's memory limit very soon that way, at which point performance isn't your problem anymore.

By the way, it's relatively easy to test this yourself: set up a script that measures the time (and peak memory) of both snippets. I'd wager they won't differ that much in time.
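
A rough sketch of such a measurement, using the hypothetical $db->fetchAll() from the question (wrap the line-by-line version the same way for a second run and compare the numbers):

$start = microtime(true);

// Snippet under test -- $db is the framework object assumed in the question.
$large_array = $db->fetchAll("select * from user");
foreach ($large_array as $user) {
    $email = $user['email'];
}

printf("time: %.3f s, peak memory: %d bytes\n",
    microtime(true) - $start,
    memory_get_peak_usage(true));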

Peter Kruithof
  • why not just make it one while loop? – Your Common Sense Aug 31 '11 at 07:54
  • Yes, I could have tested it myself, but I'm lazy. And I'm even more curious about the pros' opinions on it. For instance, some people here have presented different approaches and explained them. This, to me, is better than a test. – U0001 Sep 01 '11 at 11:53
1

It is true that the MySQL library "normally" fetches all data into client memory, usually by calling mysql_store_result(). You can split up overly large queries with the LIMIT keyword as shown above, but there is a risk that the data become inconsistent because rows may change in between batches. You can guard against this using locks.

Another approach is to use mysql_use_result(), which uses more resources on the server side and requires you to complete the fetching job as soon as possible.
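
In the old mysql extension, this behaviour is exposed through mysql_unbuffered_query(); a minimal sketch, assuming an already-open connection:

// mysql_unbuffered_query() uses mysql_use_result() under the hood,
// so rows stream from the server instead of being buffered client-side.
$result = mysql_unbuffered_query("select * from user");
while ($row = mysql_fetch_array($result)) {
    echo $row['email'] . "\n";
}
// The connection is busy until all rows are read or the result is freed,
// which is why the fetching job should finish as soon as possible.
mysql_free_result($result);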

glglgl
-2

Just something I learned when it comes to performance: foreach is faster than a while loop. Perhaps you should benchmark each of your options and see which one is faster and less memory intensive. IMHO, I like the latter approach better. But do you really need every single column in the user table? If not, then select only the columns you need instead of using * to grab them all; this will help with memory and speed as well.
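
For instance, a minimal sketch assuming only the email column is actually needed:

// Selecting only the needed column keeps each fetched row small.
$mysql_result = mysql_query("select email from user");
while ($row = mysql_fetch_array($mysql_result)) {
    echo $row['email'] . "\n";
}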

SoLoGHoST
  • OMG foreach is faster than a while – Your Common Sense Aug 31 '11 at 07:42
  • yes, when dealing with large arrays of data, use `foreach` when possible instead of `while` loops. http://juliusbeckmann.de/blog/php-foreach-vs-while-vs-for-the-loop-battle.html this has been proven time and time again... – SoLoGHoST Aug 31 '11 at 07:51
  • the question is not about foreach vs while. And this kind of comparison is useless: you execute a 12-second SQL query but try to optimize the 0.1-second loop... – remi bourgarel Aug 31 '11 at 07:51
  • yeah, I suppose you're right, just trying to help anyway. Cheers :) – SoLoGHoST Aug 31 '11 at 07:53
  • in fact, this question is about **while+foreach** vs. while. Anyway, remi is right: such a difference doesn't matter here – Your Common Sense Aug 31 '11 at 07:57
  • Actually, it's foreach vs. while. But yeah, pointless to bring this up anyways – SoLoGHoST Aug 31 '11 at 08:00
  • you know, this fetchAll() method isn't getting an array out of nowhere with a magic wand. It has to fetch the data from the database, line by line. Surprised? – Your Common Sense Aug 31 '11 at 08:04
  • it's about fetching all the data vs. fetching row by row; one method uses while, the other foreach, but that is not the subject. Read the question. – remi bourgarel Aug 31 '11 at 08:09
  • Yeah, I got it. Thanks, remind me never to try and help here again. – SoLoGHoST Aug 31 '11 at 08:14
  • Maybe you just need to get more experience first. There is nothing wrong with the intention to help others, but one has to have some experience first, some **own** experience in these matters, not just a bunch of links to cite. – Your Common Sense Aug 31 '11 at 08:17
  • I see, then the $fast variable must be much faster than the $slow one. What the? – U0001 Sep 01 '11 at 11:57