4

I'm trying to figure out how to write a script that calculates the standard deviation of the values in a file. For example, say I downloaded a CSV containing a list of values. I want to find the standard deviation of these values by running a Python program, without using NumPy.

whuan0319

6 Answers

9

If you allow the use of the standard library,

import math

xs = [0.5,0.7,0.3,0.2]     # values (floats, so that / is true division on Python 2 as well)
mean = sum(xs) / len(xs)   # mean
var  = sum(pow(x-mean,2) for x in xs) / len(xs)  # variance
std  = math.sqrt(var)  # standard deviation

If not, you need to approximate the square root by hand, for example with binary search or Newton's method. Wikipedia has a page on methods of computing square roots.
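As a rough sketch of the Newton's method option, assuming no imports at all are allowed: the helper name `newton_sqrt` and its `tolerance` and `max_iterations` parameters are made up for illustration, and the stopping rule is just one reasonable choice.

```python
def newton_sqrt(value, tolerance=1e-12, max_iterations=100):
    """Approximate the square root of a non-negative number with Newton's method."""
    if value < 0:
        raise ValueError("cannot take the square root of a negative number")
    if value == 0:
        return 0.0
    guess = value if value >= 1 else 1.0  # any positive starting point works
    for _ in range(max_iterations):
        # Newton update for f(g) = g*g - value
        next_guess = 0.5 * (guess + value / guess)
        # Stop once the relative change is below the tolerance
        if abs(next_guess - guess) <= tolerance * next_guess:
            return next_guess
        guess = next_guess
    return guess


xs = [0.5, 0.7, 0.3, 0.2]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)  # population variance
std = newton_sqrt(var)                            # replaces math.sqrt
print(std)
```

Each iteration averages the current guess with `value / guess`, so the guess converges quickly toward the square root. `var ** 0.5` would also work without `math`, if the exponent operator is allowed.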

duckworthd
  • Ah, well, it's because the data I'm importing is messy: it isn't given to me as a plain list. It has commas, spaces, and other numbers which I don't care about. I am only interested in one column, so I have to turn it into a clean list on which the functions you've written can easily be run. Just figuring out how to import it has been a struggle – whuan0319 Jun 05 '14 at 00:47
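For the import problem described in this comment, one possible sketch uses the standard library's `csv` module to pull out a single column. The helper name `read_column`, the column index, and the sample text are all hypothetical stand-ins, since the actual file layout isn't shown.

```python
import csv
import io
import math


def read_column(lines, column_index):
    """Collect the numeric values found in one column, skipping anything unparsable."""
    values = []
    for row in csv.reader(lines):
        if column_index >= len(row):
            continue  # blank or short row
        try:
            values.append(float(row[column_index].strip()))
        except ValueError:
            continue  # header or non-numeric cell
    return values


# Hypothetical sample standing in for the downloaded file.
sample = io.StringIO(
    "id, value, note\n"
    "1, 0.5, a\n"
    "2, 0.7, b\n"
    "\n"
    "3, n/a, c\n"
    "4, 0.3, d\n"
    "5, 0.2, e\n"
)

xs = read_column(sample, column_index=1)
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)  # population variance
std = math.sqrt(var)
print(xs, std)
```

With a real file, the same helper could be called on a file object opened with `open(path, newline="")` in place of the `io.StringIO` sample. Silently skipping unparsable cells is a design choice that suits messy input; raising an error instead would be safer if every row is expected to be numeric.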
4

With Python 3.4 and above, the standard library includes the statistics module, which provides the population standard deviation (pstdev), the sample standard deviation (stdev), and other functions.

Here is an example of how to use it:

import statistics

data = [1, 1, 2.5, 6.5, 7.3, 8, 9.2]

print(statistics.pstdev(data))
# 3.2159043543498815
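The answers on this page divide by either n or n - 1, which corresponds to `statistics.pstdev` and `statistics.stdev` respectively. A small comparison, using illustrative values that are not from the question:

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, not from the question

population_sd = statistics.pstdev(data)  # divides by n
sample_sd = statistics.stdev(data)       # divides by n - 1

print(population_sd)
print(sample_sd)
```

The sample value is always the larger of the two for the same data. Which one is appropriate depends on whether the file holds the whole population or only a sample of it.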
Vlad Bezden
1
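from math import sqrt

n = [11, 8, 8, 3, 4, 4, 5, 6, 6, 7, 8]

mean = sum(n) / len(n)

# Sum of squared deviations from the mean
SUM = 0
for i in n:
    SUM += (i - mean) ** 2

# Sample standard deviation (divides by n - 1)
stdeV = sqrt(SUM / (len(n) - 1))
print(stdeV)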
reyad
0
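import math

# Raw string, so the backslashes in the Windows path are not treated as escapes
filename = r"C:\Users\mmb0368\Desktop\input.txt"

# Read one integer per line, skipping blank lines
with open(filename) as file:
    num_list = [int(line) for line in file if line.strip()]

mean = sum(num_list) / len(num_list)
print(mean, max(num_list), min(num_list))

# Population standard deviation: square root of the mean squared deviation
variance = sum((x - mean) ** 2 for x in num_list) / len(num_list)
snDev = math.sqrt(variance)
print(snDev)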
0
from math import sqrt

def getAverage(mylist):
    """
    This function calculates the average of a list of numbers.

    Parameters:
    mylist (list): List of numbers

    Returns:
    float: Average of the numbers in the list

    Example:
    >>> getAverage([1,5,10])
    5.333333333333333
    """
    return sum(mylist)/len(mylist)

def getStandardDeviation(mylist):
    """
    This function calculates the standard deviation of a list of numbers.

    Parameters:
    mylist (list): List of numbers

    Returns:
    float: Standard deviation of the numbers in the list

    Example:
    >>> getStandardDeviation([1,5,10])
    4.509249752822894
    """
    avg = getAverage(mylist)  # compute the mean once, not once per element
    ls = [(i - avg) ** 2 for i in mylist]
    return sqrt(sum(ls) / (len(mylist) - 1))

mylist = [1,5,10]

getAverage(mylist=mylist)
# 5.333333333333333

getStandardDeviation(mylist=mylist)
# 4.509249752822894

This code contains two functions, getAverage and getStandardDeviation, which calculate the average and the standard deviation of a list of numbers. getAverage takes a list of numbers and returns their average. getStandardDeviation takes a list of numbers and returns their sample standard deviation: it squares the difference between each number and the average, sums those squares, divides by n - 1 (not n), and takes the square root. A sample list, mylist, is defined at the end, and both functions are called with it as the argument.

Ego
0
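def calculateSD(nums):
    n = len(nums)
    mean = sum(nums) / n  # true division; // would floor the mean
    # Sample variance (divides by n - 1)
    variance = sum((x - mean) ** 2 for x in nums) / (n - 1)

    stdev = variance ** 0.5  # square root without importing math
    print(stdev)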
  • 1
    Remember that Stack Overflow isn't just intended to solve the immediate problem, but also to help future readers find solutions to similar problems, which requires understanding the underlying code. This is especially important for members of our community who are beginners, and not familiar with the syntax. Given that, **can you [edit] your answer to include an explanation of what you're doing** and why you believe it is the best approach? – Jeremy Caney May 06 '23 at 00:03