I'm trying to figure out how to write a script that calculates the standard deviation of the values in a file. As an example, say I downloaded a CSV file containing a list of values. I want to find the standard deviation of these values by running a Python program. We are not using NumPy here!
So where exactly are you stuck? The formula is quite simple. – NPE Jun 03 '14 at 20:04
6 Answers
If you allow the use of the standard library,
import math
xs = [0.5,0.7,0.3,0.2] # values (must be floats!)
mean = sum(xs) / len(xs) # mean
var = sum(pow(x-mean,2) for x in xs) / len(xs) # variance
std = math.sqrt(var) # standard deviation
If not, you need to approximate sqrt by hand. For example, you can use binary search or Newton's method. Here's a Wikipedia page for methods of doing so
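As a rough illustration of the Newton's method option, here is a minimal sketch. The function name `newton_sqrt`, the tolerance, and the iteration cap are all my own choices for illustration, not something from the answer above, and no `math` import is used.

```python
def newton_sqrt(value, tolerance=1e-12, max_iterations=100):
    """Approximate the square root of a non-negative number with Newton's method."""
    value = float(value)
    if value < 0:
        raise ValueError("value must be non-negative")
    if value == 0:
        return 0.0
    # Any positive starting guess converges; this one is just convenient.
    guess = value if value >= 1 else 1.0
    for _ in range(max_iterations):
        # Newton update for f(g) = g*g - value
        next_guess = 0.5 * (guess + value / guess)
        if abs(next_guess - guess) <= tolerance * next_guess:
            return next_guess
        guess = next_guess
    return guess


xs = [0.5, 0.7, 0.3, 0.2]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)  # population variance
std = newton_sqrt(var)
print(std)
```

The variance step is the same as in the `math.sqrt` version; only the square root is swapped out.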

Ahh, well, it's because I am importing data that is messy, in the sense that it is not given to me as a plain list. It has commas, spaces, and other numbers which I don't care about. I am only interested in one column. I have to convert it into a clean list on which we can easily perform the functions you've written. Figuring out how to even import it has been a struggle – whuan0319 Jun 05 '14 at 00:47
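For the import problem described in this comment, one possible approach is the standard-library `csv` module. This is only a sketch under assumptions: the helper name `read_column`, the column index, and the sample data are hypothetical, since the actual file layout isn't shown in the question. Rows where the chosen column is missing or not a number are simply skipped.

```python
import csv
import io
import math


def read_column(lines, column_index):
    """Collect the numeric values from one column, skipping anything that is not a number."""
    values = []
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) <= column_index:
            continue  # blank or short row
        try:
            values.append(float(row[column_index].strip()))
        except ValueError:
            continue  # header text or other junk in this column
    return values


# Hypothetical sample data standing in for the downloaded file
sample = io.StringIO("id, value, note\n1, 2.0, a\n2, 4.0, b\n\n3, n/a, c\n4, 6.0, d\n")
values = read_column(sample, 1)

mean = sum(values) / len(values)
std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))  # population SD
print(values, std)
```

With a real file, you would pass an open file object instead of the `io.StringIO` stand-in, e.g. `with open(path, newline="") as f: values = read_column(f, 1)`, where `path` is wherever your file lives.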
With Python 3.4 and above, there is a standard-library module called statistics that provides the population standard deviation (pstdev) and other functions.
Here is an example of how to use it:
import statistics
data = [1, 1, 2.5, 6.5, 7.3, 8, 9.2]
print(statistics.pstdev(data))
# 3.2159043543498815
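Note that `pstdev` is the population standard deviation (it divides by n). The same module also has `stdev`, the sample standard deviation (it divides by n - 1). A small sketch comparing the two on the same data; the variable names are just for illustration:

```python
import statistics

data = [1, 1, 2.5, 6.5, 7.3, 8, 9.2]

population_sd = statistics.pstdev(data)  # divides by n
sample_sd = statistics.stdev(data)       # divides by n - 1

print(population_sd)
print(sample_sd)
```

Which one you want depends on whether your values are the whole population or a sample drawn from a larger one; the sample version is always slightly larger for the same data.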

from math import sqrt
n = [11, 8, 8, 3, 4, 4, 5, 6, 6, 7, 8]
mean = sum(n) / len(n)
SUM = 0
for i in n:
    SUM += (i - mean) ** 2
stdeV = sqrt(SUM / (len(n) - 1))
print(stdeV)

filename = r"C:\Users\mmb0368\Desktop\input.txt"  # raw string so backslashes are kept literally
# Read one integer per line, skipping blank lines
with open(filename) as f:
    num_list = [int(line) for line in f if line.strip()]
mean = sum(num_list) / float(len(num_list))
print(mean, max(num_list), min(num_list))
# Population standard deviation: square root of the mean squared deviation
variance = sum((x - mean) ** 2 for x in num_list) / len(num_list)
snDev = variance ** (1.0 / 2)
print(snDev)

from math import sqrt

def getAverage(mylist):
    """
    This function calculates the average of a list of numbers.
    Parameters:
    mylist (list): List of numbers
    Returns:
    float: Average of the numbers in the list
    Example:
    >>> getAverage([1,5,10])
    5.333333333333333
    """
    return sum(mylist)/len(mylist)

def getStandardDeviation(mylist):
    """
    This function calculates the standard deviation of a list of numbers.
    Parameters:
    mylist (list): List of numbers
    Returns:
    float: Standard deviation of the numbers in the list
    Example:
    >>> getStandardDeviation([1,5,10])
    4.509249752822894
    """
    ls=[]
    for i in mylist:
        ls.append((i - getAverage(mylist))**2)
    return sqrt( sum(ls) / (len(mylist) - 1) )

mylist = [1,5,10]
getAverage(mylist=mylist)
# 5.333333333333333
getStandardDeviation(mylist=mylist)
# 4.509249752822894
This code contains two functions, getAverage and getStandardDeviation, for calculating the average and the standard deviation of a list of numbers, respectively. The getAverage function takes a list of numbers and returns their average. The getStandardDeviation function takes a list of numbers and returns their sample standard deviation: it first finds the squared difference of each number from the average, then divides the sum of those squared differences by n - 1 (the number of values minus one), and finally takes the square root. A sample list, mylist, is defined at the end, and both functions are called with this list as the argument.
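One possible variation on the same calculation, as a sketch: compute the average once instead of once per element, and cross-check the result against `statistics.stdev` (assuming Python 3.4+). The name `sample_standard_deviation` and the guard for short lists are my own additions, not part of the answer above.

```python
from math import sqrt
import statistics


def sample_standard_deviation(values):
    """Sample standard deviation (divides by n - 1); needs at least two values."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n  # computed once, outside the loop
    squared_diffs = [(v - mean) ** 2 for v in values]
    return sqrt(sum(squared_diffs) / (n - 1))


result = sample_standard_deviation([1, 5, 10])
print(result)
print(statistics.stdev([1, 5, 10]))  # should agree up to floating-point rounding
```

For a three-element list the difference is negligible, but for large inputs recomputing the average inside the loop makes the work grow quadratically.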

@Ego code only answer is not ideal, please add details to improve your answer's clarity – NanoPish Jan 26 '23 at 11:27
def calculateSD(nums):
    n = len(nums)
    mean = sum(nums) / n  # true division; // would truncate the mean
    # Sample variance: sum of squared deviations divided by n - 1
    variance = sum((x - mean) ** 2 for x in nums) / (n - 1)
    stdev = variance ** 0.5
    print(stdev)
Remember that Stack Overflow isn't just intended to solve the immediate problem, but also to help future readers find solutions to similar problems, which requires understanding the underlying code. This is especially important for members of our community who are beginners, and not familiar with the syntax. Given that, can you edit your answer to include an explanation of what you're doing and why you believe it is the best approach? – Jeremy Caney May 06 '23 at 00:03