I have a list:

s = [0.995537725, 0.994532199, 0.996027983, 0.999891383, 1.004754272, 1.003870012, 0.999888944, 0.994438078, 0.992548715, 0.998344545, 1.004504764, 1.00883411]

I calculated its standard deviation in Excel with =STDEV(C5:N5) and got 0.005106477.

Then I did the same calculation using numpy.std:

import numpy as np

print(np.std(s))

However, I got the answer: 0.0048890791894

I even wrote up my own std function:

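import math

def std(input_list):
    count = len(input_list)
    mean = float(sum(input_list)) / float(count)

    # Accumulate the squared deviations from the mean.
    overall = 0.0
    for i in input_list:
        overall = overall + (i - mean) * (i - mean)

    # Dividing by count (n) gives the population standard deviation.
    return math.sqrt(overall / count)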

and my own function gives the same result as numpy.

So I am wondering: is there really such a difference, or did I just make a mistake?

Alex Riley
ChangeMyName

1 Answer

There's a difference: Excel's STDEV calculates the sample standard deviation, while NumPy's std calculates the population standard deviation by default (it behaves like Excel's STDEVP).

To make NumPy's std function behave like Excel's STDEV, pass in the value ddof=1:

>>> np.std(s, ddof=1)
0.0051064766704396617

This calculates the standard deviation of s using the sample variance (i.e. dividing by n-1 rather than n).
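To make the n versus n-1 distinction concrete, here is a minimal pure-Python sketch. The helper name std_with_ddof is hypothetical and is not NumPy's implementation. It only mirrors the idea behind ddof, which is that the sum of squared deviations is divided by n - ddof:

```python
import math

def std_with_ddof(values, ddof=0):
    """Illustrative helper (hypothetical name): divide by n - ddof."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_dev / (n - ddof))

s = [0.995537725, 0.994532199, 0.996027983, 0.999891383,
     1.004754272, 1.003870012, 0.999888944, 0.994438078,
     0.992548715, 0.998344545, 1.004504764, 1.00883411]

population_std = std_with_ddof(s)       # divides by n, like np.std(s) or STDEVP
sample_std = std_with_ddof(s, ddof=1)   # divides by n - 1, like STDEV

print(population_std)  # roughly 0.00489, matching the question's NumPy result
print(sample_std)      # roughly 0.00511, matching the question's Excel result
```

The two results differ by a factor of sqrt(n / (n - 1)), which is why the gap is noticeable for a list of only 12 values and shrinks as n grows.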

Alex Riley
  • I've always found those Excel names misleading and confusing... STDEV calculates an estimate of the population's standard deviation based on the sample, while STDEVP calculates the sample's standard deviation, which is the population's if the sample is the whole population. – Jaime Dec 07 '15 at 19:27
  • Another way of doing this since Python 3.4 is to use the statistics module included in the standard library. You have statistics.stdev() for the sample standard deviation and statistics.pstdev() for the population standard deviation. (This does not require installing any external library such as NumPy, and its naming is closer to Excel's terminology.) – Sebastià Serra Rigo Jun 20 '19 at 10:35
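
As a rough illustration of the standard-library route mentioned in the last comment, the sketch below applies statistics.stdev() and statistics.pstdev() to the list from the question. It assumes Python 3.4 or later, and the variable names are only for illustration:

```python
import statistics

s = [0.995537725, 0.994532199, 0.996027983, 0.999891383,
     1.004754272, 1.003870012, 0.999888944, 0.994438078,
     0.992548715, 0.998344545, 1.004504764, 1.00883411]

sample_std = statistics.stdev(s)       # divides by n - 1, like Excel's STDEV
population_std = statistics.pstdev(s)  # divides by n, like Excel's STDEVP

print(sample_std)      # roughly 0.00511
print(population_std)  # roughly 0.00489
```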