Questions tagged [standard-deviation]

Standard deviation (represented by the symbol sigma, σ) shows how much variation or "dispersion" exists from the average (mean, or expected value).

The standard deviation of a random variable, statistical population, data set, or probability distribution is the square root of its variance. A standard deviation close to 0 indicates that the data points tend to be very close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the data points are spread out over a wider range of values.

The standard deviation of X is the quantity

$$\sigma = \sqrt{\operatorname{E}\left[(X - \mu)^2\right]}, \quad \text{where } \mu = \operatorname{E}[X].$$
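As a rough illustration of the definition above, the sketch below computes the standard deviation as the square root of the mean squared deviation from the mean, treating a small made-up data set as a whole population. The data values and the helper name `population_sd` are assumptions chosen for illustration; Python's standard `statistics.pstdev` is used only as a cross-check.

```python
import math
import statistics

def population_sd(values):
    """Square root of the mean squared deviation from the mean."""
    mu = sum(values) / len(values)                                # mean (expected value)
    variance = sum((x - mu) ** 2 for x in values) / len(values)   # population variance
    return math.sqrt(variance)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data set; its mean is 5
sigma = population_sd(data)

print(sigma)                      # 2.0
print(statistics.pstdev(data))    # library cross-check
```

A data set whose values are all identical would give 0, matching the "close to the mean" reading of a small standard deviation.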

870 questions
2
votes
1 answer

Multiple filter arguments dplyr

I want to filter out multiple data errors in a huge (>20 000 points) dataset. Here is a pretend dataset (EDIT: I simplified it significantly): data<-data.table(age=c(1,1,1,2,2,2,3,3,4,4,4,4,4,4),wt=c(32,12,5,32,80,32,1,0,4,8,1,1,2,50)) In this…
Blundering Ecologist
  • 1,199
  • 2
  • 14
  • 38
2
votes
3 answers

Removing irrelevant values (end tail) from (non)normal distribution array of numbers

While I appreciate this question is math-heavy, the real answer for this question will be helpful for all those who are dealing with MongoDB's $bucket operator (or its SQL analogues) and building cluster/heatmap chart data. Long description of the…
AlexZeDim
  • 3,520
  • 2
  • 28
  • 64
2
votes
2 answers

How can I calculate the sd? Error in as.double(x): cannot coerce type 'S4' to vector of type 'double'

Does anybody know what is wrong with my code? I edited the post because I didn't give you the data. I want to calculate the sd. The calculation of the mean worked. Here is the link to the cropped…
2
votes
2 answers

Line Plot with standard deviation

I want to create a line chart with two lines that shows the standard deviation for each line. At the moment, I have a line chart that shows the two lines. In my code, Categories is the name for the x-axis, and Result 1/2 are the results…
Jonas Aust
  • 27
  • 4
2
votes
1 answer

find the best combination from two lists

I have two different lists. One holds the mean values that were calculated from other sublists, and the other holds the standard deviations. Is there a way to find the best combination? Meaning, to find the highest mean with the lowest…
piggy
  • 115
  • 10
2
votes
4 answers

Calculating mean and standard deviation and ignoring 0 values

I have a list of lists with sublists, all of which contain float values. For example, the one below has 2 lists, each with sublists: mylist = [[[2.67, 2.67, 0.0, 0.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67, 2.67, 2.0, 2.0], [2.67,…
piggy
  • 115
  • 10
2
votes
1 answer

How do I incorporate SE in place of SD in my bar chart error bars? Also, how do I change the order of my x-axis groups?

I have created a bar chart displaying the proportion of time spent on different behaviours for groups of lemurs. However, I am faced with two problems. 1) I had hoped to use standard error bars in place of my standard deviation bars. I am unsure how…
2
votes
0 answers

How to make 2 arrays with the same length and similar standard deviation between the index/value?

Requirement: I would like to make a function in Python that is given 2 arrays, one bigger than the other. I would like the bigger one to be reduced to the same size as the smaller one, but not only by removing the end or beginning of the array. I would…
SylwekFr
  • 308
  • 3
  • 21
2
votes
2 answers

Why does dnorm() not return the standard deviation I inputted when I do sd(dnorm())?

This may be a dumb question, but I don't understand why sd(dnorm(1:100, mean=50, sd=15)) doesn't return the standard deviation as [1] 15.0 instead of what it actually returns, which is [1] 0.009440673. When I do this with rnorm(), sd(rnorm(100,…
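One way to read the mismatch described in the excerpt above: `dnorm()` returns density values (heights of the bell curve) rather than draws from the distribution, so `sd(dnorm(...))` measures the spread of those heights, whereas `rnorm()` returns random draws whose spread approximates the requested standard deviation. The sketch below mirrors this in Python, using the standard library's `statistics.NormalDist` as a stand-in for R's `dnorm`/`rnorm`; the number of draws and the seed are arbitrary choices made for illustration.

```python
import statistics

dist = statistics.NormalDist(mu=50, sigma=15)

# Analogue of dnorm(1:100, mean=50, sd=15): density heights at x = 1..100.
densities = [dist.pdf(x) for x in range(1, 101)]
sd_of_densities = statistics.stdev(densities)   # spread of the heights, not of the data

# Analogue of rnorm(...): random draws from the distribution.
# 10_000 draws and seed=0 are arbitrary illustrative choices.
draws = dist.samples(10_000, seed=0)
sd_of_draws = statistics.stdev(draws)

print(sd_of_densities)   # small: the densities never exceed about 0.027
print(sd_of_draws)       # close to 15
```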
2
votes
1 answer

How to calculate standard deviation every 3 columns in a dataframe?

I have a dataframe with 4895 rows and 75 columns. I need to calculate the standard deviation of every 3 columns, for each row. So at the end I should have 4895 rows and 25 columns (75/3), where each column is the SD calculated across three columns. This…
Antonio Manco
  • 217
  • 2
  • 14
2
votes
2 answers

Standard deviation calculation - Am I following the right approach? How do we find SD percentage?

I have a query which gives me the data below: item_name, total_purchase_count_per_week, previous_day_purchase_count. For example: iPhone, 4800, 200; Samsung, 3000, 470; Moto, 1700, 80. Now, I'm interested in knowing what percentage yesterday's purchase…
CuriousToLearn
  • 153
  • 1
  • 12
2
votes
2 answers

Normalizing the columns of a dataframe

I want to normalize the column in the following dataframe: import pandas as pd from pprint import pprint d = {'A': [1,0,3,0], 'B':[2,0,1,0], 'C':[0,0,8,0], 'D':[1,0,0,1]} df = pd.DataFrame(data=d) df = (df - df.mean())/df.std() I am not sure if the…
Natasha
  • 1,111
  • 5
  • 28
  • 66
2
votes
2 answers

If I have a large list of coordinates, how can I extract the y-values that correspond to a specific x-value?

I have three datasets that compile into one big dataset. Data1 has x-values ranging from 0-47 (ordered), with many y-values (a small error) attached to an x-value. In total there are approx 100000 y values. Data 2 and 3 are similar but with…
2
votes
1 answer

Adding a standard error column to my data set so error bars can be plotted

Data <- data.frame(id, consumption, Day, Hour) #The data is a large time series data set with thousands of values per household id. #eg. consumption <- c(99, 119, 130, 110, 109, 118) etc. #Hour and Day were calculated from the Date Time of the…
EllisR8
  • 169
  • 2
  • 10
2
votes
1 answer

Why do np.std(X) and X.std() return different values?

I am trying to calculate normalized scores for my dataset using mean normalization. When I write (X - np.mean(X))/np.std(X), it gives me a different score than (X - X.mean())/X.std(). The problem seems to be coming from the calculation of standard…
Matt
  • 79
  • 9
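A frequent cause of this kind of discrepancy, assuming `X` in the excerpt above is a pandas object (the excerpt is truncated, so this is a guess), is the divisor: NumPy's `np.std` defaults to `ddof=0` (divide by n), while pandas' `.std()` defaults to `ddof=1` (divide by n - 1). The sketch below reproduces the two conventions with a hypothetical pure-Python helper, `std_with_ddof`, and made-up data.

```python
import math

def std_with_ddof(values, ddof=0):
    """Standard deviation dividing the squared deviations by (n - ddof)."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_dev / (n - ddof))

x = [1.0, 2.0, 3.0, 4.0]                      # made-up data

population_style = std_with_ddof(x, ddof=0)   # NumPy's default convention
sample_style = std_with_ddof(x, ddof=1)       # pandas' default convention

print(population_style, sample_style)         # the second value is larger
```

Passing the same `ddof` to both libraries, for example `np.std(X, ddof=1)`, should make the two results agree.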