I have a pandas DataFrame with very large columns. For one of the columns (call it column A), I need the fastest possible way to build a new column (call it column E) or a separate Series. Each element of E is the minimum of the corresponding element of A and the k elements that precede it in A. I have to repeat this process many times, so speed is my main concern.

As an example, if k were equal to 2, and this were my data frame:

     A   B
 v  10  10
 w   4   8
 x  11   1
 y   9  22
 z  15   5

I would want to create the following column E based on column A:

    E
 v  10
 w  4
 x  4
 y  4
 z  9

To simplify the problem, I could also accept this result, where the first k elements are NaN:

    E
 v  NaN
 w  NaN
 x  4
 y  4
 z  9
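
To pin down the rule precisely, here is a minimal pure-Python sketch of it. It is a slow reference, not the fast solution being asked for. The function name `trailing_min` and the `pad_with_nan` flag are made up for illustration; the flag switches between the two acceptable results shown above.

```python
import math


def trailing_min(values, k, pad_with_nan=False):
    """Minimum of each element and the k elements before it (hypothetical helper)."""
    out = []
    for i in range(len(values)):
        if pad_with_nan and i < k:
            # Fewer than k predecessors: emit NaN, as in the simplified result
            out.append(math.nan)
        else:
            # Slice covers the current element plus up to k predecessors
            out.append(min(values[max(0, i - k): i + 1]))
    return out


a = [10, 4, 11, 9, 15]  # column A from the example
print(trailing_min(a, 2))                     # [10, 4, 4, 4, 9]
print(trailing_min(a, 2, pad_with_nan=True))  # [nan, nan, 4, 4, 9]
```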
A. Arpi

1 Answer

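You can use pd.rolling_min with window=k+1 to compute the rolling minimum over each element and the k elements before it. In pandas 0.18 and later the same operation is written df['A'].rolling(window=3).min(), and the top-level pd.rolling_min function has since been removed. For k = 2:

pd.rolling_min(df['A'], window=3)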

v   NaN
w   NaN
x     4
y     4
z     9
dtype: float64
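
As a sketch of the same idea using the `Series.rolling` method, and assuming the example frame from the question (rebuilt by hand here), passing `min_periods=1` lets the first k rows use whatever values are available. That should reproduce the first of the two results in the question rather than the NaN-padded one. The names `e_strict` and `e_full` are only illustrative.

```python
import pandas as pd

# Example frame from the question, rebuilt by hand
df = pd.DataFrame(
    {"A": [10, 4, 11, 9, 15], "B": [10, 8, 1, 22, 5]},
    index=list("vwxyz"),
)
k = 2

# Strict window: NaN until k + 1 values are available
e_strict = df["A"].rolling(window=k + 1).min()

# min_periods=1: partial windows at the start are allowed
e_full = df["A"].rolling(window=k + 1, min_periods=1).min()

print(e_strict.tolist())  # [nan, nan, 4.0, 4.0, 9.0]
print(e_full.tolist())    # [10.0, 4.0, 4.0, 4.0, 9.0]
```

Both results come back as float64, because the rolling aggregation returns floats even when the input column is integer.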
Jianxun Li