In an infinite loop, I want to break out based on the number of elements in an array. Say:

$myarr = array();

while (true) {
    //... do something that modifies $myarr ...
    if (count($myarr) > 100000) { break; }
}

The problem is that every time I try to code it this way, thoughts of micro-optimization creep into my mind (blame me). I tell myself: why not just use a variable to keep track of the number of elements in the array? Like this:

$myarr = array();
$n_myarr = 0;

while (True){
    // ... do something that modifies $myarr
    if ( ... elements added ... )
        { $n_myarr += $n_elements_added; }

    else if ( ... elements removed ... )
        { $n_myarr -= $n_elements_removed; }

    if ($n_myarr > 100000) { break; }
}

As far as I understand, how count() performs depends entirely on the underlying implementation of count() and of arrays. I always prefer to write code the simpler way when I can, as in the first snippet. Can anyone enlighten me on this subject? In particular, how does count() work under the hood?

Thank you.

-Titon

Titon

3 Answers

After writing a little benchmark script, I think I've found my answer. Here's the script:

<?php

// Number of count() calls timed for each array size.
$n_iteration = 1e7;

// Array sizes to test, from 100 to 5 million elements.
$test_sizes = array(
    1e2, 1e3, 1e4, 1e5, 1e6, 2e6, 3e6, 4e6, 5e6
);

foreach ($test_sizes as $test_size) {
    $test_array = range(1, $test_size);

    $start_time = microtime(true);

    // Timed section: repeated count() calls on an array that never changes.
    for ($i = 0; $i < $n_iteration; $i++) {
        $x = count($test_array);
    }

    $end_time = microtime(true);
    $interval = $end_time - $start_time;
    printf(
        "Iterations: %d, Size: %8d,"
        ." Total time: %6.3f sec, Avg. time: %1.3e sec\n",
        $n_iteration, $test_size, $interval, $interval/$n_iteration);
}

Running the script on my machine with "PHP 5.4.4-2 (cli) (built: Jun 19 2012 07:38:55)" produces the following output:

Iterations: 10000000, Size:      100, Total time:  3.548 sec, Avg. time: 3.548e-7 sec
Iterations: 10000000, Size:     1000, Total time:  3.368 sec, Avg. time: 3.368e-7 sec
Iterations: 10000000, Size:    10000, Total time:  3.549 sec, Avg. time: 3.549e-7 sec
Iterations: 10000000, Size:   100000, Total time:  3.407 sec, Avg. time: 3.407e-7 sec
Iterations: 10000000, Size:  1000000, Total time:  4.557 sec, Avg. time: 4.557e-7 sec
Iterations: 10000000, Size:  2000000, Total time:  3.263 sec, Avg. time: 3.263e-7 sec
Iterations: 10000000, Size:  3000000, Total time:  3.574 sec, Avg. time: 3.574e-7 sec
Iterations: 10000000, Size:  4000000, Total time:  4.047 sec, Avg. time: 4.047e-7 sec
Iterations: 10000000, Size:  5000000, Total time:  3.628 sec, Avg. time: 3.628e-7 sec

As we can see, the average time per count() call (loop overhead included) is roughly constant, between about 0.3 and 0.5 microseconds, regardless of the size of the array.

Conclusion:

PHP itself keeps track of the number of elements in an array efficiently: count() on an array has O(1) runtime cost. There is no need for an extra variable for the sake of efficiency.

count() is fine for both readability and efficiency.
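As a rough illustration of that conclusion, here is a minimal sketch of the loop from the question that relies on count() alone, with a hand-maintained counter kept next to it purely for comparison. The workload (append three values, drop one every fourth pass) and the small `$limit` are assumptions made up for this example, not part of the original code.

```php
<?php
// Hypothetical workload standing in for "something that modifies $myarr":
// each pass appends three values, and every fourth pass removes one.
$limit   = 1000;     // assumed threshold, much smaller than in the question
$myarr   = array();
$n_myarr = 0;        // manual counter, kept only to compare against count()
$passes  = 0;

while (true) {
    $passes++;

    array_push($myarr, $passes, $passes, $passes);
    $n_myarr += 3;

    if ($passes % 4 === 0) {
        array_pop($myarr);
        $n_myarr -= 1;
    }

    // The exit test uses count() directly, as in the first snippet.
    if (count($myarr) > $limit) { break; }
}

// count() and the hand-maintained counter agree.
var_dump(count($myarr) === $n_myarr);
```

The manual counter adds bookkeeping at every place the array is modified, and it silently drifts if one update is forgotten; count() cannot drift because it reports the array's actual size.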

Titon

If you want faster processing, the second snippet will perform faster because it does not call a function. If you look up the count function at http://bg2.php.net/manual/en/function.count.php, you can see that it's implemented from a class, and as we all know, OOP is slower than procedural code.

HerpaMoTeH
  • You have a point. But I'm not concerned about whether the cumulative time spent in count() slows the code slightly due to an OOP implementation. Rather: is it wise to call count() on an array with millions of elements? What are the big-O performance characteristics of count()? – Titon Jul 23 '12 at 08:06
  • OOP is 5 to 10% slower than procedural code. Adding to and subtracting from a variable is faster than calling a function every time you enter the loop. I think it's wise to use count() only once, when initializing the variable that holds the number of elements. – HerpaMoTeH Jul 23 '12 at 08:41
  • "you can see that it's implemented from a class"? Array is not a class in PHP. The implementation is pure C. –  Jul 23 '12 at 20:01
  • @duskwuff If you had read the description of the function on php.net, you would have seen the following: "The interface has exactly one method, Countable::count(), which returns the return value for the count() function." So yes, it's implemented from a class. – HerpaMoTeH Jul 24 '12 at 05:28
  • That's only used when `count()` is used on objects which are instances of SPL classes. Array is not an object; it's a primitive type. –  Jul 24 '12 at 06:31
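
To make the distinction in the last comment concrete, here is a minimal sketch, assuming PHP 7 or later for the `: int` return type. The `Basket` class and its `$calls` property are hypothetical names invented for this example. It shows that count() dispatches to Countable::count() only when it is given an object implementing that interface, while a plain array is handled by the engine without any method call.

```php
<?php
// Hypothetical class, used only to show when Countable::count() is involved.
class Basket implements Countable
{
    private $items = array();
    public $calls = 0;   // how many times count() was dispatched to this object

    public function add($item)
    {
        $this->items[] = $item;
    }

    public function count(): int
    {
        $this->calls++;
        return count($this->items);   // plain array: no method dispatch here
    }
}

$basket = new Basket();
$basket->add('apple');
$basket->add('pear');

$fromObject = count($basket);                 // goes through Basket::count()
$fromArray  = count(array('a', 'b', 'c'));    // handled directly by the engine

printf("object: %d, array: %d, method calls: %d\n",
    $fromObject, $fromArray, $basket->calls);
// prints "object: 2, array: 3, method calls: 1"
```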

You should cache the result of count(). It's probably not going to make a huge difference, but it's still an easy optimization. In the benchmark below, calling count() in the loop condition on every iteration was approximately 4 times slower than caching the result.

Code

$array = range(0,100000);

for($x = 100; $x <= 1000000; $x += 100) {

    $countResults = [];
    $staticResults = [];

    for($i = 0; $i < $x; $i++) {

        $start = microtime(true);
        for($j = 0; $j < count($array); $j++) {}
        $end = microtime(true);

        $countResults[] = $end-$start;

        $start = microtime(true);
        $size = count($array);
        for($j = 0; $j < $size; $j++) {}
        $end = microtime(true);

        $staticResults[] = $end-$start;

    }

    $countSum = array_sum($countResults);

    echo sprintf(
         "Count  - Iterations: %d; Total Time: %05.6f; Avg time: %05.6f\n",
         $x,
         $countSum,
         $countSum/$x
     );

    $staticSum = array_sum($staticResults);

    echo sprintf(
         "Static - Iterations: %d; Total Time: %05.6f; Avg time: %05.6f\n",
         $x,
         $staticSum,
         $staticSum/$x
     );

}

Results:

Count  - Iterations: 100; Total Time: 0.962752; Avg time: 0.009628
Static - Iterations: 100; Total Time: 0.253768; Avg time: 0.002538
Count  - Iterations: 200; Total Time: 2.258045; Avg time: 0.011290
Static - Iterations: 200; Total Time: 0.579273; Avg time: 0.002896
Count  - Iterations: 300; Total Time: 2.907984; Avg time: 0.009693
Static - Iterations: 300; Total Time: 0.786796; Avg time: 0.002623
Count  - Iterations: 400; Total Time: 3.756074; Avg time: 0.009390
Static - Iterations: 400; Total Time: 1.004253; Avg time: 0.002511
Count  - Iterations: 500; Total Time: 5.086776; Avg time: 0.010174
Static - Iterations: 500; Total Time: 1.363288; Avg time: 0.002727
Count  - Iterations: 600; Total Time: 6.626471; Avg time: 0.011044
Static - Iterations: 600; Total Time: 1.793517; Avg time: 0.002989
Count  - Iterations: 700; Total Time: 6.780818; Avg time: 0.009687
Static - Iterations: 700; Total Time: 1.816578; Avg time: 0.002595
Count  - Iterations: 800; Total Time: 7.640220; Avg time: 0.009550
Static - Iterations: 800; Total Time: 2.026010; Avg time: 0.002533
Count  - Iterations: 900; Total Time: 8.436923; Avg time: 0.009374
Static - Iterations: 900; Total Time: 2.237418; Avg time: 0.002486
Count  - Iterations: 1000; Total Time: 9.483782; Avg time: 0.009484
Static - Iterations: 1000; Total Time: 2.520293; Avg time: 0.002520
Count  - Iterations: 1100; Total Time: 10.492371; Avg time: 0.009539
Static - Iterations: 1100; Total Time: 2.803949; Avg time: 0.002549
Count  - Iterations: 1200; Total Time: 11.305185; Avg time: 0.009421
Static - Iterations: 1200; Total Time: 3.027705; Avg time: 0.002523
Count  - Iterations: 1300; Total Time: 12.249071; Avg time: 0.009422
Static - Iterations: 1300; Total Time: 3.265644; Avg time: 0.002512
Count  - Iterations: 1400; Total Time: 13.166538; Avg time: 0.009405
Static - Iterations: 1400; Total Time: 3.499845; Avg time: 0.002500
Count  - Iterations: 1500; Total Time: 14.204276; Avg time: 0.009470
Static - Iterations: 1500; Total Time: 3.776997; Avg time: 0.002518
Count  - Iterations: 1600; Total Time: 15.280157; Avg time: 0.009550
Static - Iterations: 1600; Total Time: 4.076611; Avg time: 0.002548
Count  - Iterations: 1700; Total Time: 15.938380; Avg time: 0.009376
Static - Iterations: 1700; Total Time: 4.246082; Avg time: 0.002498
Count  - Iterations: 1800; Total Time: 16.967943; Avg time: 0.009427
Static - Iterations: 1800; Total Time: 4.493304; Avg time: 0.002496
Count  - Iterations: 1900; Total Time: 17.870854; Avg time: 0.009406
Static - Iterations: 1900; Total Time: 4.749316; Avg time: 0.002500
Count  - Iterations: 2000; Total Time: 18.900052; Avg time: 0.009450
Static - Iterations: 2000; Total Time: 5.038069; Avg time: 0.002519
Count  - Iterations: 2100; Total Time: 20.487390; Avg time: 0.009756
Static - Iterations: 2100; Total Time: 5.480530; Avg time: 0.002610
Count  - Iterations: 2200; Total Time: 21.328690; Avg time: 0.009695
Static - Iterations: 2200; Total Time: 5.671044; Avg time: 0.002578
Count  - Iterations: 2300; Total Time: 22.270163; Avg time: 0.009683
Static - Iterations: 2300; Total Time: 5.906530; Avg time: 0.002568
Count  - Iterations: 2400; Total Time: 23.392992; Avg time: 0.009747
Static - Iterations: 2400; Total Time: 6.225149; Avg time: 0.002594
Count  - Iterations: 2500; Total Time: 24.346405; Avg time: 0.009739
Static - Iterations: 2500; Total Time: 6.494287; Avg time: 0.002598
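
For reference, the two loop shapes being compared can be reduced to the sketch below, with a foreach variant added for contrast. The tiny ten-element array and the `$sumA`, `$sumB`, `$sumC` names are assumptions for illustration only. Note that caching the size is only safe while the array is not resized inside the loop; in a loop like the one in the question, where the array changes on every pass, a cached size would go stale.

```php
<?php
$array = range(0, 9);

// Variant 1: count() is re-evaluated before every iteration.
$sumA = 0;
for ($j = 0; $j < count($array); $j++) {
    $sumA += $array[$j];
}

// Variant 2: the size is read once; valid only while $array is not resized.
$sumB = 0;
for ($j = 0, $size = count($array); $j < $size; $j++) {
    $sumB += $array[$j];
}

// Variant 3: foreach needs no explicit size at all.
$sumC = 0;
foreach ($array as $value) {
    $sumC += $value;
}

// All three variants visit the same elements.
var_dump($sumA === $sumB && $sumB === $sumC);
```

The gap measured above comes from paying the function-call overhead once per iteration rather than once per loop, not from count() getting slower on larger arrays.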
Jonathan