I have a data frame df with 120 columns named A, B, C, D, etc. Let's say A looks like this:

A
31
12
51
72
81
..

I want to create a column SDA that holds an expanding standard deviation of A: the first value of SDA is the standard deviation of the first value of A, the second value is the standard deviation of the first two values of A, the third value is the standard deviation of the first three values of A, and so on.

SDA
x1
x2
x3
x4
x5 
...

Here x1 is the SD of 31 (this may be 0, as there is only one value), x2 is the SD of 31, 12, x3 is the SD of 31, 12, 51, x4 is the SD of 31, 12, 51, 72, and so on.

How can I carry out this operation and generate such a standard deviation column for every column in df at once (so that many SD columns are produced at the same time)?

Jewel_R
  • Does this answer your question? [Rolling and cumulative standard deviation in a Python dataframe](https://stackoverflow.com/questions/44879517/rolling-and-cumulative-standard-deviation-in-a-python-dataframe) – ddejohn Sep 06 '21 at 17:56

1 Answer

Try with expanding:

df["SDA"] = df["A"].expanding().std().fillna(0)

>>> df
    A        SDA
0  31   0.000000
1  12  13.435029
2  51  19.502137
3  72  25.826343
4  81  28.500877
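
As a sanity check on what expanding().std() computes, here is a minimal sketch that recomputes each value by hand as the sample standard deviation (ddof=1, the pandas default) of the first k rows. The helper name `manual_expanding_std` is made up for illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [31, 12, 51, 72, 81]})

def manual_expanding_std(values):
    """Hypothetical helper: sample SD (ddof=1) of the first k values, for each k."""
    out = []
    for k in range(1, len(values) + 1):
        # A single observation has no sample SD; use 0 to mirror fillna(0).
        out.append(0.0 if k == 1 else float(np.std(values[:k], ddof=1)))
    return out

manual = manual_expanding_std(df["A"].to_numpy())
via_pandas = df["A"].expanding().std().fillna(0)

# Both approaches should agree row by row.
print(np.allclose(manual, via_pandas.to_numpy()))
```

The loop is only there to show the definition; on a real frame the expanding() call is the one to use.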

To apply this to all columns, you could do:

output = df.join(df.expanding().std().fillna(0).add_prefix("SD"))
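
A small sketch of what that join produces, assuming every column is numeric. Column B and its values are invented for illustration.

```python
import pandas as pd

# Toy frame; column B is invented for illustration.
df = pd.DataFrame({"A": [31, 12, 51, 72, 81], "B": [1, 2, 3, 4, 5]})

# One expanding-SD column per original column, named SD<column>.
sd = df.expanding().std().fillna(0).add_prefix("SD")
output = df.join(sd)

print(list(output.columns))  # ['A', 'B', 'SDA', 'SDB']
```

The original columns are kept and the SD columns are appended to the right, so the same line should scale to all 120 columns.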

To apply this to only the first 10 columns:

output = df.join(df.iloc[:,:10].expanding().std().fillna(0).add_prefix("SD"))
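
If the columns of interest are not the first ones by position, selecting them by name should work the same way. This is a sketch; the list `cols` and the toy data are assumptions for illustration.

```python
import pandas as pd

# Toy frame; columns B and C are invented for illustration.
df = pd.DataFrame({"A": [31, 12, 51], "B": [1, 2, 3], "C": [5, 5, 5]})

cols = ["A", "C"]  # hypothetical subset of columns to process
output = df.join(df[cols].expanding().std().fillna(0).add_prefix("SD"))

print(list(output.columns))  # ['A', 'B', 'C', 'SDA', 'SDC']
```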
not_speshal
  • That is helpful, thanks. If I want to apply expanding only to certain columns, say the first 10, how can I use the `output = df.join(df.expanding().std().fillna(0).add_prefix("SD"))` command? Also, doesn't the `df["SDA"] = df.expanding().std().fillna(0)` command apply to all elements of `df`? Should I use it like this instead: `df["SDA"] = df['A'].expanding().std().fillna(0)`? – Jewel_R Sep 06 '21 at 18:27