What is the meaning of the parameter bias of pandas.core.window.ewm.ExponentialMovingWindow.var?
The official reference describes it as "Use a standard estimation bias correction."
I think this explanation is unclear. Is it related to unbiased variance? I would like a mathematical explanation with formulas.
kosnik explains how it's calculated, but doesn't explain why this is necessary.
Additionally, I don't understand why this parameter is False by default in this function. Based on Nilesh Ingle's answer, I wrote some sample code to reproduce the pandas result. As the output shows, even though bias is False in the pandas call, reproducing its result requires computing a bias correction factor and multiplying the weighted variance by it.
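For reference, this is my reading of what the code below computes for a window ending at position t, written as formulas. The symbols w_i, \bar{x}_t and the "biased"/"debiased" labels are my own notation, not taken from the pandas reference.

```latex
w_i = (1-\alpha)^{\,t-i}, \qquad i = 0, \dots, t

\bar{x}_t = \frac{\sum_{i=0}^{t} w_i x_i}{\sum_{i=0}^{t} w_i}

\sigma^2_{\text{biased},t} = \frac{\sum_{i=0}^{t} w_i \,(x_i - \bar{x}_t)^2}{\sum_{i=0}^{t} w_i}

\sigma^2_{\text{debiased},t} =
  \frac{\left(\sum_{i=0}^{t} w_i\right)^2}
       {\left(\sum_{i=0}^{t} w_i\right)^2 - \sum_{i=0}^{t} w_i^2}
  \;\sigma^2_{\text{biased},t}
```

If all weights are equal (w_i = 1 over n observations), the factor becomes n^2 / (n^2 - n) = n / (n - 1), which is the usual correction for the unbiased sample variance. That is why I suspect the parameter is related to it.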
import numpy as np
import pandas as pd


def calc_ewm_var(array: np.ndarray, alpha: float) -> np.ndarray:
    ewm_var_list = []
    data_len = len(array)
    for i in range(data_len):
        # Get the expanding window ending at position i
        win_data = array[:i + 1]
        # Calculate the bias-corrected exponential moving variance
        ewmvar = calc_win_ewm_var(win_data, alpha)
        ewm_var_list.append(ewmvar)
    return np.array(ewm_var_list)


def calc_win_ewm_var(win_data, alpha):
    win_len = len(win_data)
    # Weights (1 - alpha)**k, with k = 0 for the newest observation
    weight_arr = (1 - alpha) ** np.arange(win_len - 1, -1, -1)
    # Calculate exponential moving average
    ewma = np.sum(weight_arr * win_data) / np.sum(weight_arr)
    # Calculate the bias correction factor
    bias_denom = np.sum(weight_arr) ** 2 - np.sum(weight_arr ** 2)
    if bias_denom == 0:
        # With a single observation the corrected variance is undefined
        return np.nan
    bias = np.sum(weight_arr) ** 2 / bias_denom
    # Calculate the weighted variance and apply the correction factor
    ewmvar = bias * np.sum(weight_arr * (win_data - ewma) ** 2) / np.sum(weight_arr)
    return ewmvar
l = [12.0, 12.5, 13.1, 14.6, 17.8, 19.1, 24.5]
sz = pd.Series(l)
alpha = 0.1
span = 2 / alpha - 1  # the span is 19

ewmv_pd = sz.ewm(alpha=alpha).var()  # bias is False by default
ewmv_pd.name = "ewmvar_pd"
ewmstd_pd = sz.ewm(alpha=alpha).std()
ewmstd_pd.name = "ewmstd_pd"
ewm_pd_data = pd.concat([ewmv_pd, ewmstd_pd], axis=1)

array = sz.values
ewm_var_calced = calc_ewm_var(array, alpha)
ewm_std_calced = np.sqrt(ewm_var_calced)
ewm_pd_data["ewm_var_calced"] = ewm_var_calced
ewm_pd_data["ewm_std_calced"] = ewm_std_calced
ewm_pd_data = ewm_pd_data[["ewmvar_pd", "ewm_var_calced", "ewmstd_pd", "ewm_std_calced"]]
ewm_pd_data
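To isolate what the parameter changes, the last row can also be compared against a hand-computed weighted variance, once with bias=True and once with the default. This is only a minimal sketch: it assumes the default adjust=True weighting, and the names weights, biased_var and correction are my own, not pandas identifiers.

```python
import numpy as np
import pandas as pd

values = np.array([12.0, 12.5, 13.1, 14.6, 17.8, 19.1, 24.5])
alpha = 0.1
series = pd.Series(values)

# Weights over the full window: 1 for the newest value,
# (1 - alpha)**(n - 1) for the oldest (assumes adjust=True).
weights = (1 - alpha) ** np.arange(len(values) - 1, -1, -1)
mean = np.sum(weights * values) / np.sum(weights)

# Plain weighted variance, with no correction applied.
biased_var = np.sum(weights * (values - mean) ** 2) / np.sum(weights)

# Correction factor used in the sample code above.
correction = np.sum(weights) ** 2 / (np.sum(weights) ** 2 - np.sum(weights ** 2))

pandas_biased = series.ewm(alpha=alpha).var(bias=True).iloc[-1]
pandas_default = series.ewm(alpha=alpha).var().iloc[-1]

print(np.isclose(biased_var, pandas_biased))
print(np.isclose(biased_var * correction, pandas_default))
```

If both comparisons hold, then bias=True returns the uncorrected weighted variance, and the default bias=False is the case where the correction factor is applied. That would explain why my code has to multiply by it.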