What I have so far is the code below. It works and gives the results it should: it fills df['c'] with the previous c multiplied by b wherever no c is given. The problem is that I have to apply this to a much bigger data set (len(df.index) is about 10,000), so the function I have so far is not suitable: I would have to write df['c'] = df.apply(func, axis=1) a couple of thousand times. A while loop is not an option in pandas for a dataset of this size. Any ideas?
import pandas as pd
import numpy as np
import datetime
randn = np.random.randn
rng = pd.date_range('1/1/2011', periods=10, freq='D')
df = pd.DataFrame({'a': [None] * 10, 'b': [2, 3, 10, 3, 5, 8, 4, 1, 2, 6]},index=rng)
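df["c"] = np.nan
# Seed the known values with .loc instead of chained indexing (df["c"][0] = ...)
df.loc[rng[0], "c"] = 1
df.loc[rng[2], "c"] = 3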
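def func(x):
    # Keep c where it is already given
    if pd.notnull(x['c']):
        return x['c']
    else:
        # Previous row's c (looked up in the global df) times this row's b
        return df.iloc[df.index.get_loc(x.name) - 1]['c'] * x['b']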
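# Each pass only fills rows whose previous row already has a value,
# so the call has to be repeated until no NaN is left
df['c'] = df.apply(func, axis =1)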
df['c'] = df.apply(func, axis =1)
df['c'] = df.apply(func, axis =1)
df['c'] = df.apply(func, axis =1)
df['c'] = df.apply(func, axis =1)
df['c'] = df.apply(func, axis =1)
df['c'] = df.apply(func, axis =1)
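
One direction that might avoid the repeated apply calls, as a rough sketch rather than a tested solution: every given c starts a new segment, and inside a segment each missing c equals the seed value times the running product of b. The helper name fill_c is made up for illustration, and the sketch assumes the first row always has a c and that the products stay within float range.

```python
import numpy as np
import pandas as pd

# Same toy data as above
rng = pd.date_range("1/1/2011", periods=10, freq="D")
df = pd.DataFrame({"b": [2, 3, 10, 3, 5, 8, 4, 1, 2, 6]}, index=rng)
df["c"] = np.nan
df.loc[rng[0], "c"] = 1.0
df.loc[rng[2], "c"] = 3.0


def fill_c(frame):
    """Hypothetical helper: fill missing c as previous c * b in one pass."""
    known = frame["c"].notna()
    # Every row with a given c opens a new segment
    segment = known.cumsum()
    # Multiplier is 1 on the seed row of a segment and b on the rows after it
    factor = frame["b"].astype(float).where(~known, 1.0)
    # Running product of the multipliers, restarted at each segment
    running = factor.groupby(segment).cumprod()
    # Seed value carried forward, scaled by the running product
    return frame["c"].ffill() * running


df["c"] = fill_c(df)
print(df["c"].tolist())
# [1.0, 3.0, 3.0, 9.0, 45.0, 360.0, 1440.0, 1440.0, 2880.0, 17280.0]
```

On the 10-row example this gives the same column as the seven apply passes. Rows before the first given c would stay NaN, and very long segments could overflow to inf, so treat it as a starting point only.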