I have a list of data points that all belong to one cluster. Each item is a numpy array with 3 features (representing a point). I compute their centroid (the mean of the points). I want to calculate how far a point deviates from the centroid. More precisely, I want to find out how many standard deviations away a point is from the centroid of the cluster. Please help me code this.

My list of data points looks something like this:

([-5.75204079 8.78545302 8.00800119],....)

1 Answer

Assuming the data points in a cluster are stored in a list called data, the following code calculates the standard deviation of that data, separately for each of the 3 features.

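import numpy as np

# Calculate mean (the centroid), one value per feature
mean = sum(data) / len(data)

# Calculate sum of squared differences
# of data points from mean, per feature
dev = 0
for rec in data:
    dev += (rec - mean) ** 2

# Calculate (population) variance, per feature
var = dev / len(data)

# Calculate standard deviation, per feature.
# np.sqrt is needed here: math.sqrt raises a TypeError on a 3-element array.
std_dev = np.sqrt(var)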
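The code above stops at the standard deviation itself. To express how many standard deviations away a point is from the centroid, one possible sketch is shown here. The sample data and the names points, z_scores and scaled_distance are made up for illustration, and dividing the Euclidean distance by the root-mean-square distance is only one assumed way to get a single number per point, not the only definition.

```python
import numpy as np

# Hypothetical sample cluster: each item is a point with 3 features.
data = [
    np.array([1.0, 2.0, 3.0]),
    np.array([2.0, 4.0, 6.0]),
    np.array([3.0, 6.0, 9.0]),
]

points = np.vstack(data)         # shape (n_points, 3)
centroid = points.mean(axis=0)   # mean of the points, per feature
std_dev = points.std(axis=0)     # population std (ddof=0), per feature

# Per-feature z-scores: how many standard deviations each coordinate
# of each point lies from the centroid. Assumes no feature has zero spread.
z_scores = (points - centroid) / std_dev

# A single number per point (an assumed definition): Euclidean distance
# to the centroid divided by the root-mean-square distance of the cluster.
distances = np.linalg.norm(points - centroid, axis=1)
rms_distance = np.sqrt(np.mean(distances ** 2))
scaled_distance = distances / rms_distance
```

If a feature is constant across the cluster, std_dev is zero for that feature and the division yields inf or nan, so that case would need separate handling.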
Supratim Haldar