Standard deviation in python

Question

This is the spectrum I want to analyze. How do I measure the standard deviation excluding the channels where the peak is present? Let's say the peak lies between 30,000 m/s and 90,000 m/s.

Exclude the portion of data you don't want, then compute the std? What exactly is blocking you? [ask] — Julien, Apr 12 '21 at 08:21
There is no [tag:python] in this question. Any solutions posted would likely lead to further questions. See [ask] and how to create a [mcve]. — Peter Wood, Apr 12 '21 at 08:22

score 1 · Accepted Answer · answered Apr 12 '21 at 08:32

numpy.std()

To exclude the peak, you're going to have to define what you consider a peak to be. Otherwise you will end up with a solution that only works for the curve you present.

If you know that (i) your data oscillates around 0, (ii) there are no massive troughs (i.e. very negative minima), and (iii) the data should roughly balance around 0, then you could use that to define a peak as any value greater than twice the absolute value of the minimum:

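import numpy

list1 = [0, 1, 2, 15, -2, 3, -3, 5]
# Treat anything at or above twice the absolute value of the minimum as peak.
threshold = 2 * abs(min(list1))
list2 = [ent for ent in list1 if ent < threshold]

std1 = numpy.std(list1)  # peak included
std2 = numpy.std(list2)  # peak excluded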

If your data fails any of (i), (ii) or (iii), then you're going to have to perform some filtering. Here's a useful link to get you started: https://ocefpaf.github.io/python4oceanographers/blog/2015/03/16/outlier_detection/
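As one possible form of such filtering, here is a minimal sketch that rejects points far from the median, using the median absolute deviation (MAD) as a spread estimate. This is a swapped-in technique, not necessarily what the linked post does; the function name `std_without_outliers` and the cut-off `n_sigmas=3.0` are assumptions chosen for illustration.

```python
import numpy as np

def std_without_outliers(values, n_sigmas=3.0):
    """Standard deviation after dropping points far from the median.

    Hypothetical helper: the name and the default cut-off are assumptions.
    """
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    # Median absolute deviation: a spread estimate that a single peak barely moves.
    mad = np.median(np.abs(values - median))
    # 1.4826 scales the MAD to the standard deviation of Gaussian noise.
    robust_sigma = 1.4826 * mad
    if robust_sigma == 0:
        # Degenerate case (more than half the points identical): nothing to clip.
        return float(np.std(values))
    keep = np.abs(values - median) <= n_sigmas * robust_sigma
    return float(np.std(values[keep]))

list1 = [0, 1, 2, 15, -2, 3, -3, 5]
print(std_without_outliers(list1))  # smaller than numpy.std(list1)
```

Unlike the rule based on the minimum, this does not require the data to be centred on 0, but it still assumes that the peak occupies only a minority of the channels.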

score 0 · Answer 2 · answered Apr 12 '21 at 08:38

Preamble: Lidia, this is your first question. Questions of this type are normally not answered on Stack Overflow. It is not a service that provides solutions; it is a service that helps you find the solution yourself. Next time, state not only the problem but, most importantly, what you know, what you think, and what you have tried so far, so that we can actually help you.

Your data is just a series of points (x_i, y_i). Calculate the average and the variance (https://en.wikipedia.org/wiki/Variance) in a loop according to

<y>=1/N sum_i^N y_i

and

<y2>=1/N sum_i^N y_i**2

and exploiting

Variance = <y2> - <y>**2

as well as

RMS = sqrt(Variance)

import math

# x is list of x-values of your data
# y is list of y-values of your data
sum = 0.
sum2 = 0.
count = 0
for i in range(len(x)):
  # skip the channels that contain the peak
  if 30000 <= x[i] <= 90000: continue
  count += 1
  sum += y[i]
  sum2 += pow(y[i], 2)

variance = sum2/count - pow(sum/count, 2)
RMS = math.sqrt(variance)  # standard deviation of the channels outside the peak
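The same calculation can also be sketched with a NumPy boolean mask, which avoids the explicit loop. The function name `std_outside_window` and the sample arrays are hypothetical and only there to make the example runnable; the window limits are the 30,000 m/s and 90,000 m/s from the question.

```python
import numpy as np

def std_outside_window(x, y, lo=30_000.0, hi=90_000.0):
    """Standard deviation of y over the channels whose x lies outside [lo, hi].

    Hypothetical helper written for this example.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = (x < lo) | (x > hi)  # True for channels away from the peak
    if not keep.any():
        raise ValueError("no channels left outside the window; check the units of x")
    # numpy.std defaults to ddof=0, which matches Variance = <y2> - <y>**2.
    return float(np.std(y[keep]))

# Made-up data: velocity axis in m/s with a peak between 30,000 and 90,000.
x = np.array([0, 10_000, 20_000, 40_000, 60_000, 80_000, 100_000, 110_000])
y = np.array([1.0, -1.0, 1.0, 9.0, 20.0, 8.0, -1.0, 1.0])
noise_std = std_outside_window(x, y)
print(noise_std)
```

If you want the sample standard deviation instead (dividing by N - 1), pass `ddof=1` to `np.std`.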

Thank you for acknowledging my question. I tried implementing this code, but I receive an error: `variance = (sum2/count) - (pow(sum/count, 2))` `ZeroDivisionError: float division by zero` — Lidia, Apr 12 '21 at 09:26
The problem is that `count` remains 0 in your case. Please investigate your values in `x` and `y` and check the logic of this really simple piece of code to find out why this is the case. — Ralf Ulrich, Apr 12 '21 at 14:05
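A quick way to investigate is to look at the range of `x` and at how many channels fall on each side of the 30,000 to 90,000 limits. The sketch below assumes, purely for illustration, that the velocity axis is stored in km/s rather than m/s, which is one common reason for a selection to come up empty; `describe_selection` and the sample axis are hypothetical.

```python
import numpy as np

def describe_selection(x, lo=30_000.0, hi=90_000.0):
    """Report the range of x and how many points lie inside/outside [lo, hi].

    Hypothetical diagnostic helper.
    """
    x = np.asarray(x, dtype=float)
    inside = (x >= lo) & (x <= hi)
    return {
        "x_min": float(x.min()),
        "x_max": float(x.max()),
        "n_inside": int(inside.sum()),
        "n_outside": int((~inside).sum()),
    }

# Made-up axis from -200 to 200 in steps of 1, as if stored in km/s.
x_kms = np.linspace(-200.0, 200.0, 401)
print(describe_selection(x_kms))           # no channel falls inside the limits
print(describe_selection(x_kms * 1000.0))  # after converting to m/s
```

If one of the two counts is 0, the limits and the data are probably not in the same units, or the comparison is selecting the opposite side from the one intended.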
