I have a script which calls this function more than 100k times, so I am looking for any way to squeeze a bit more performance out of it.

Can you suggest optimisations or an alternate method for calculating standard deviation in PHP?

function calcStandardDev($samples){
    $sample_count = count($samples);

    for ($current_sample = 0; $sample_count > $current_sample; ++$current_sample)
        $sample_square[$current_sample] = pow($samples[$current_sample], 2);

    return sqrt(array_sum($sample_square) / $sample_count - pow((array_sum($samples) / $sample_count), 2));
}
Andrew Hall

3 Answers

$samples[$current_sample] * $samples[$current_sample]

is going to be faster than

pow($samples[$current_sample], 2)

because it doesn't have the overhead of the function call.

Then you can simplify

pow((array_sum($samples) / $sample_count), 2)

in the same way, multiplying the mean by itself so that pow() is not called again.

To avoid array_sum($samples) being called twice as a result of that change, calculate it once and store it in a variable before the loop, then reference that variable in the formula.

EDIT

function calcStandardDev($samples){
    $sample_count = count($samples);
    $sumSamples = array_sum($samples);
    // Mean computed once from the stored sum, then squared by multiplication.
    $mean = $sumSamples / $sample_count;

    for ($current_sample = 0; $sample_count > $current_sample; ++$current_sample)
        $sample_square[$current_sample] = $samples[$current_sample] * $samples[$current_sample];

    return sqrt(array_sum($sample_square) / $sample_count - $mean * $mean);
}
Mark Baker

Replace both calls to array_sum() by accumulating the respective sums yourself. That way you walk through the array once instead of three times.

function calcStandardDev($samples){

    $sample_count = count($samples);
    $sum = 0;
    $sum_square = 0;

    for ($current_sample = 0; $sample_count > $current_sample; ++$current_sample) {
        $sum_square += pow($samples[$current_sample], 2);
        $sum += $samples[$current_sample];
    }

    return sqrt( $sum_square / $sample_count - pow( $sum / $sample_count, 2));
}
Sirko

foreach by reference is faster than for, and since you already have a loop, you can calculate the sum in that same loop. Also, $x * $x is much faster than pow($x, 2). Below is a comparison of some variants. Hope it helps.

Your function: ~0.526 seconds (measured with microtime)

Second function: ~0.290 seconds

  <?php
    function calcStandardDev($samples)
    {


        $sample_count = count($samples);

        for ($current_sample = 0; $sample_count > $current_sample; ++$current_sample) 
            $sample_square[$current_sample] = pow($samples[$current_sample], 2);

        return sqrt(array_sum($sample_square) / $sample_count - pow((array_sum($samples) / $sample_count), 2));

    }

    function calcStandardDev2($samples)
    {
        $sample_count = count($samples);

        $sum_sample_square  = 0;
        $sum_sample         = 0;

        foreach ($samples as &$sample) 
        {
            $sum_sample         += $sample;
            $sum_sample_square  += $sample * $sample; 
        }

        return sqrt($sum_sample_square / $sample_count - pow($sum_sample / $sample_count,2));

    }

    function calcStandardDev3($samples)
    {
        $sample_count = count($samples);

        $sum_sample_square  = 0;
        $sum_sample         = 0;

        foreach ($samples as &$sample) 
        {
            $sum_sample         += $sample;
            $sum_sample_square  += pow($sample ,2); 
        }

        return sqrt($sum_sample_square / $sample_count - pow($sum_sample  / $sample_count,2));

    }

    echo "<pre>";
    $samples = range(2,100000);

    $start  = microtime(true);
    echo calcStandardDev($samples)."\r\n";
    $end  = microtime(true);
    echo $end - $start ."\r\n";  
    echo "-------\r\n";

    $start  = microtime(true);
    echo calcStandardDev2($samples)."\r\n";
    $end  = microtime(true);
    echo $end - $start."\r\n";
    echo "-------\r\n";

    $start  = microtime(true);
    echo calcStandardDev3($samples)."\r\n";
    $end  = microtime(true);
    echo $end - $start."\r\n";
    echo "-------\r\n";
?>
  • Thanks! Out of all solutions - the one posted as calcStandardDev2() is the quickest – Andrew Hall Mar 05 '12 at 14:03
  • We already worked out that version 2 is the quickest, but the return line still has a pow(). Would it not be quicker to replace that call as well? My tests say no, but the difference is minimal - thoughts? – Andrew Hall Mar 11 '12 at 16:38
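
On the remaining pow() in the return line, here is a minimal sketch of what that variant could look like, assuming the mean is stored in a local variable and squared by multiplication. The name calcStandardDev4 is hypothetical and not part of the answer above, the loop iterates by value rather than by reference, and no timing is claimed for it. That line runs once per call rather than once per sample, so any difference is likely to be small.

```php
<?php
// Hypothetical variant of calcStandardDev2(); the function name is an
// illustration only and does not appear in the original answer.
function calcStandardDev4(array $samples)
{
    $sample_count = count($samples);

    $sum_sample_square = 0;
    $sum_sample        = 0;

    // Single pass: accumulate the sum and the sum of squares together.
    foreach ($samples as $sample) {
        $sum_sample        += $sample;
        $sum_sample_square += $sample * $sample;
    }

    // Compute the mean once and square it by multiplication instead of pow().
    $mean = $sum_sample / $sample_count;

    // Mean of the squares minus the square of the mean, as in the versions above.
    return sqrt($sum_sample_square / $sample_count - $mean * $mean);
}

// Small check: mean is 5, mean of squares is 29, so the result is sqrt(4).
echo calcStandardDev4(array(2, 4, 4, 4, 5, 5, 7, 9)), "\n"; // prints 2
```

Like the other versions, this assumes a non-empty array; an empty one would divide by zero.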