20

I am trying to calculate Z-scores using PHP. Essentially, I am looking for the most efficient way to calculate the mean and standard deviation of a data set (PHP array). Any suggestions on how to do this in PHP?

I am trying to do this in the smallest number of steps.

markus
Spencer

5 Answers

35

To calculate the mean you can do:

$mean = array_sum($array) / count($array);

The (sample) standard deviation is computed like so:

// Function to calculate the square of (value - mean)
function sd_square($x, $mean) { return pow($x - $mean, 2); }

// Function to calculate the sample standard deviation (uses sd_square)
function sd($array) {
    $mean = array_sum($array) / count($array);
    // square root of the sum of squares divided by N-1
    return sqrt(array_sum(array_map("sd_square", $array, array_fill(0, count($array), $mean))) / (count($array) - 1));
}

right off this page
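Since the question is ultimately about Z-scores, here is a minimal sketch of how the mean and sample standard deviation above could be combined into one pass over the data. The function name `z_scores` and the choice to throw on degenerate input are my own assumptions, not something from the linked page:

```php
<?php
// Hypothetical helper: returns the z-score of every element of $values,
// using the sample standard deviation (N - 1 in the denominator).
function z_scores(array $values): array
{
    $n = count($values);
    if ($n < 2) {
        throw new InvalidArgumentException('Need at least two values.');
    }
    $mean = array_sum($values) / $n;

    // Sum of squared deviations from the mean
    $sumSquares = 0.0;
    foreach ($values as $v) {
        $sumSquares += ($v - $mean) ** 2;
    }
    $sd = sqrt($sumSquares / ($n - 1));
    if ($sd == 0.0) {
        throw new InvalidArgumentException('Standard deviation is zero.');
    }

    // array_map keeps the original keys when given a single array
    return array_map(function ($v) use ($mean, $sd) {
        return ($v - $mean) / $sd;
    }, $values);
}

$z = z_scores([2, 4, 4, 4, 5, 5, 7, 9]);
```

If you want population Z-scores instead, divide by `$n` rather than `$n - 1`.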

miken32
Naftali
  • The PHP stats_standard_deviation() function executes compiled C code and will run much faster than an equivalent function written in PHP code. – geofflee Aug 11 '16 at 06:48
  • The PHP "built-in" stats_standard_deviation returns a different value from the `sd()` function you present. I checked using Excel STDEV and it matches yours, but interestingly, stats_standard_deviation seems to be returning a 95% confidence level... not exactly what the Excel CONFIDENCE function returns, but pretty close. – Hank Oct 09 '12 at 12:51
  • That's because stats_standard_deviation() calculates the **population** standard deviation by default, whereas Excel STDEV calculates the **sample** standard deviation. To obtain the same results, you must either use Excel STDEVP or call stats_standard_deviation() with $sample = true. To understand why sample and population differ, see [Bessel's correction](https://en.wikipedia.org/wiki/Bessel%27s_correction). – geofflee Aug 11 '16 at 06:47
  • Could it be that you neglected to set $sample = true? – Dave Burton Jun 22 '15 at 20:23
  • @dankyi-anno-kwaku Stack Overflow is an English-only site. Please be careful not to link to foreign language resources. – miken32 Nov 18 '21 at 00:16
14

How about using the built-in statistics package, with functions like stats_standard_deviation and stats_harmonic_mean? I can't find a function for the plain arithmetic mean, but if you know anything about statistics, I'm sure you can figure something out using the built-in functions.
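As a rough sketch, the extension function could be wrapped so the code still works where the stats extension is not installed. The wrapper name `std_dev_with_fallback` is hypothetical; the `$sample` flag mirrors the second parameter of stats_standard_deviation(), and the plain-PHP branch is only an assumption about what a fallback might look like:

```php
<?php
// Hypothetical wrapper: prefers the PECL stats extension when it is loaded,
// otherwise falls back to a plain PHP calculation.
function std_dev_with_fallback(array $values, bool $sample = false): float
{
    $n = count($values);
    if ($n === 0 || ($sample && $n < 2)) {
        throw new InvalidArgumentException('Not enough values.');
    }

    if (function_exists('stats_standard_deviation')) {
        // Population standard deviation unless $sample is true
        return stats_standard_deviation($values, $sample);
    }

    // Fallback: sum of squared deviations from the arithmetic mean
    $mean = array_sum($values) / $n;
    $sumSquares = 0.0;
    foreach ($values as $v) {
        $sumSquares += ($v - $mean) ** 2;
    }

    // Divide by N for the population, N - 1 for a sample
    return sqrt($sumSquares / ($sample ? $n - 1 : $n));
}

$data = [2, 4, 4, 4, 5, 5, 7, 9];
$mean = array_sum($data) / count($data);          // arithmetic mean
$populationSd = std_dev_with_fallback($data);     // like Excel STDEVP
$sampleSd = std_dev_with_fallback($data, true);   // like Excel STDEV
```

The arithmetic mean is still just array_sum() divided by count(), so no extension function is needed for it.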

rockerest
  • Have an upvote then :-) Given that this was the best answer ("smallest number of steps" and (probably) "most efficient way"). (Maybe someone was unhappy about saying _built-in_; you have to do "sudo pecl install stats" and then edit php.ini) – Darren Cook Nov 04 '11 at 08:35
  • @DarrenCook I wonder if there's a composer update to this problem. My pain point with pear/pecl has been as an app developer that redistributes, you never know when/if a customer can use pear. Composer is helping change that :-) Just my $0.02. PS - Upvote for you rockerest – Bob Gregor Apr 17 '14 at 20:44
  • The stats PECL package appears to be no longer maintained since the last update was in June 2016 https://pecl.php.net/package/stats – 8ctopus Nov 24 '20 at 09:28
5
function standard_deviation($aValues)
{
    $fMean = array_sum($aValues) / count($aValues);
    // Accumulate the sum of squared deviations from the mean
    $fVariance = 0.0;
    foreach ($aValues as $i)
    {
        $fVariance += pow($i - $fMean, 2);
    }
    // Sample standard deviation: divide by N - 1
    $size = count($aValues) - 1;
    return (float) sqrt($fVariance) / sqrt($size);
}
1

This is the same std dev code from the php pages mentioned in the top answer, but modernized a bit, without the one-line array magic and separate helper function:

/**
 * Calculates the standard deviation of an array of numbers.
 * @param   array   $array  The array of numbers to calculate the standard deviation of.
 * @return  float   The standard deviation of the numbers in the given array.
 */
function calculateStdDev(array $array): float
{
    $size = count($array);
    $mean = array_sum($array) / $size;
    $squares = array_map(function ($x) use ($mean) {
        return pow($x - $mean, 2);
    }, $array);

    return sqrt(array_sum($squares) / ($size - 1));
}
starbeamrainbowlabs
Bobo
-1

This topic is pretty old, but you can easily get an equivalent of Excel's STDEV.P function (the population standard deviation) with a function like the one below.

function stdev_p($arr) {
    $n = count($arr);
    $mean = array_sum($arr) / $n;
    $arr2 = array();
    // foreach works for any array keys, not only 0..N-1
    foreach ($arr as $value) {
        $arr2[] = pow($value - $mean, 2);
    }
    // population standard deviation: divide by N, not N-1
    return sqrt(array_sum($arr2) / $n);
}
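
For reference, here is a short summary of the formulas the functions in this thread implement, as I understand them. The symbols are my own notation: $N$ values $x_i$ with arithmetic mean $\bar{x}$.

```latex
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i
\qquad
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}
\qquad
s = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}
\qquad
z_i = \frac{x_i - \bar{x}}{s}
```

Here $\sigma$ is the population form (Excel STDEV.P, as in stdev_p above) and $s$ is the sample form with the $N-1$ denominator. Use $\sigma$ in place of $s$ in the last formula for population Z-scores.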