
I have data that looks like this:

 BatchID    UnitID  Score  Median
0    A123  A123-100  0.111  0.1065
1    A123  A123-101  0.121  0.1065
2    A123  A123-102  0.101  0.1065
3    A123  A123-103  0.102  0.1065
4    B456  B456-200  0.211  0.2160
5    B456  B456-201  0.221  0.2160
6    C789  C789-001  0.199  0.1955
7    C789  C789-002  0.189  0.1955
8    C789  C789-003  0.192  0.1955
9    C789  C789-004  0.201  0.1955

Each Unit (UnitID) has a score and belongs to a Batch (BatchID). Originally, this table did not have the "Median" column; I created it with `df['Median'] = df.groupby('BatchID')['Score'].transform('median')`.
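For reference, here is a minimal sketch that rebuilds the table above and reproduces the "Median" column. The values are typed in by hand from the printed table, so the construction of `df` is an assumption about how the real data is loaded.

```python
import pandas as pd

# Rebuild the sample table shown above (values copied by hand).
df = pd.DataFrame({
    "BatchID": ["A123"] * 4 + ["B456"] * 2 + ["C789"] * 4,
    "UnitID": [
        "A123-100", "A123-101", "A123-102", "A123-103",
        "B456-200", "B456-201",
        "C789-001", "C789-002", "C789-003", "C789-004",
    ],
    "Score": [0.111, 0.121, 0.101, 0.102,
              0.211, 0.221,
              0.199, 0.189, 0.192, 0.201],
})

# transform('median') computes one median per batch and repeats it
# on every row of that batch, so the result lines up with df.
df["Median"] = df.groupby("BatchID")["Score"].transform("median")

print(df)
```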

Now I want a new column, called 'R-Sigma', in which I apply this Robust Sigma formula to each value:

RS = IQR/1.349

My first problem is that I don't know how to use the IQR function; my second is that I don't know how to apply this calculation to each value.
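One possible way to get there, sketched under the assumption that the IQR is meant per batch (like the median) and that the default linear-interpolation quantiles are acceptable: take the 75th and 25th percentiles of Score within each BatchID, subtract them, and map the result back onto the rows. The name `iqr_by_batch` is just an illustrative helper. With its default settings, `scipy.stats.iqr` applied to each batch's scores should give the same numbers.

```python
import pandas as pd

# Sample data copied by hand from the table above.
df = pd.DataFrame({
    "BatchID": ["A123"] * 4 + ["B456"] * 2 + ["C789"] * 4,
    "Score": [0.111, 0.121, 0.101, 0.102,
              0.211, 0.221,
              0.199, 0.189, 0.192, 0.201],
})

scores = df.groupby("BatchID")["Score"]

# IQR = 75th percentile minus 25th percentile, one value per batch.
iqr_by_batch = scores.quantile(0.75) - scores.quantile(0.25)

# Broadcast each batch's IQR back to its rows, then apply RS = IQR / 1.349.
df["R-Sigma"] = df["BatchID"].map(iqr_by_batch) / 1.349

print(iqr_by_batch)
print(df)
```

Every row in the same batch gets the same R-Sigma, because the IQR describes the spread of the whole batch rather than a single score.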

Finally, I would like two additional columns, 'Upper Limit' and 'Lower Limit', holding Median + 6 * Robust Sigma and Median - 6 * Robust Sigma, respectively.
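Put together, the whole calculation might look like the sketch below. It assumes per-batch medians and IQRs, pandas' default quantile interpolation, and the constants 1.349 and 6 from the description above; the variable names `scores` and `iqr_by_batch` are illustrative.

```python
import pandas as pd

# Sample data copied by hand from the table above.
df = pd.DataFrame({
    "BatchID": ["A123"] * 4 + ["B456"] * 2 + ["C789"] * 4,
    "Score": [0.111, 0.121, 0.101, 0.102,
              0.211, 0.221,
              0.199, 0.189, 0.192, 0.201],
})

scores = df.groupby("BatchID")["Score"]

# Per-batch median, repeated on every row of the batch.
df["Median"] = scores.transform("median")

# Per-batch robust sigma: IQR / 1.349, mapped back onto the rows.
iqr_by_batch = scores.quantile(0.75) - scores.quantile(0.25)
df["R-Sigma"] = df["BatchID"].map(iqr_by_batch) / 1.349

# Limits are plain column arithmetic once both inputs are row-aligned.
df["Upper Limit"] = df["Median"] + 6 * df["R-Sigma"]
df["Lower Limit"] = df["Median"] - 6 * df["R-Sigma"]

print(df)
```

A quick sanity check is that every Score in the sample falls between its batch's Lower Limit and Upper Limit.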

How could I do this? I am completely lost.

bharatk
azura
  • What does IQR mean? You need to show the formula and how it is computed from the values in the dataframe. If there's an outside variable, you might want to compute and store that too. – Joe Aug 01 '19 at 09:51
  • scipy.stats.iqr, the interquartile range. I just don't get how to apply it, or how to do the rest of what I asked. – azura Aug 01 '19 at 10:12
  • Does it take any parameters? Like `scipy.stats.iqr('parameter_here')`. If so, and the parameter needs to be one of the values in a specific row of a given column of your DataFrame, you might need to loop over the rows. – Joe Aug 01 '19 at 10:19
