
What does the bias parameter of pandas.core.window.ewm.ExponentialMovingWindow.var mean? The official reference describes it only as "Use a standard estimation bias correction."

I think this explanation is unclear. Is it related to unbiased variance? I would like a mathematical explanation with formulas.

kosnik explains how it is calculated, but not why it is necessary.

Additionally, I don't understand why this parameter is False by default in this function. Based on Nilesh Ingle's answer, I wrote some sample code to reproduce the pandas result. As the result shows, even though bias is False in pandas.ewm.var, reproducing the pandas output requires calculating a bias-correction factor and multiplying the variance by it.
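For reference, here is what the sample code below computes for each expanding window x_0, ..., x_t, written as formulas. The symbols (w_i, mu_t, c_t) are my own notation, not taken from the pandas documentation, and the weights assume the default adjust=True.

```latex
\begin{align}
w_i &= (1-\alpha)^{\,t-i}, \qquad i = 0, \dots, t \\
\mu_t &= \frac{\sum_{i=0}^{t} w_i x_i}{\sum_{i=0}^{t} w_i} \\
\sigma^2_{t,\text{raw}} &= \frac{\sum_{i=0}^{t} w_i (x_i - \mu_t)^2}{\sum_{i=0}^{t} w_i} \\
c_t &= \frac{\left(\sum_{i=0}^{t} w_i\right)^2}{\left(\sum_{i=0}^{t} w_i\right)^2 - \sum_{i=0}^{t} w_i^2} \\
\sigma^2_t &= c_t \, \sigma^2_{t,\text{raw}}
\end{align}
```

If all weights were equal (w_i = 1 over n observations), c_t would reduce to n^2 / (n^2 - n) = n / (n - 1), the same factor that appears in the unbiased sample variance. For t = 0 the denominator of c_t is zero, which is why the code returns NaN for the first element.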

import numpy as np
import pandas as pd

def calc_ewm_var(array:np.ndarray,alpha):
    ewm_var_list=[]

    data_len=len(array)
    for i in range(data_len):
        # Get the expanding window: all observations up to and including index i
        win_data=array[:i+1]

        # Calculate the bias-corrected exponentially weighted variance
        ewmvar = calc_win_ewm_var(win_data,alpha)

        # Store the variance for this window
        ewm_var_list.append(ewmvar)

    return np.array(ewm_var_list)

def calc_win_ewm_var(win_data,alpha):
    win_len=len(win_data)
    weight_arr=(1-alpha)**np.arange(win_len-1,-1,-1)

    # Calculate exponential moving average
    ewma = np.sum(weight_arr * win_data) / np.sum(weight_arr)

    # Calculate the bias-correction factor (undefined for a single observation)
    bias_denom=(np.sum(weight_arr)**2 - np.sum(weight_arr**2))
    if bias_denom==0:
        return np.nan
    bias = np.sum(weight_arr)**2 / bias_denom

    # Calculate the exponentially weighted variance and apply the correction factor
    ewmvar = bias*np.sum(weight_arr * (win_data - ewma)**2) / np.sum(weight_arr)
    return ewmvar

l = [12.0, 12.5, 13.1, 14.6, 17.8, 19.1, 24.5]
sz = pd.Series(l)
alpha=0.1
span=2/alpha-1 # the span is 19

ewmv_pd=sz.ewm(alpha=alpha).var() # bias is False by default
ewmv_pd.name="ewmvar_pd"
ewmstd_pd=sz.ewm(alpha=alpha).std()
ewmstd_pd.name="ewmstd_pd"
ewm_pd_data=pd.concat([ewmv_pd,ewmstd_pd],axis=1)

array=sz.values
ewm_var_calced=calc_ewm_var(array,alpha)
ewm_std_calced=np.sqrt(ewm_var_calced)

ewm_pd_data["ewm_var_calced"]=ewm_var_calced
ewm_pd_data["ewm_std_calced"]=ewm_std_calced
ewm_pd_data=ewm_pd_data[["ewmvar_pd","ewm_var_calced","ewmstd_pd","ewm_std_calced"]]
ewm_pd_data

[Image: pandas ewm var reproduction]
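To make the comparison explicit, here is a small check that separates the uncorrected weighted variance from the correction factor and compares each against pandas. It is only a sketch under the assumption that the default adjust=True weights are used; ewm_var_parts is a hypothetical helper name of mine, not a pandas function.

```python
import numpy as np
import pandas as pd


def ewm_var_parts(values, alpha):
    """Return (uncorrected variance, correction factor) per expanding window.

    Hypothetical helper: assumes adjust=True style weights (1 - alpha)**k.
    """
    raw, factor = [], []
    for i in range(len(values)):
        win = values[: i + 1]
        # Newest observation gets weight 1, older ones decay by (1 - alpha)
        w = (1 - alpha) ** np.arange(i, -1, -1)
        mean = np.sum(w * win) / np.sum(w)
        # Weighted variance without any correction
        raw.append(np.sum(w * (win - mean) ** 2) / np.sum(w))
        # Correction factor; undefined when only one observation is present
        denom = np.sum(w) ** 2 - np.sum(w ** 2)
        factor.append(np.sum(w) ** 2 / denom if denom > 0 else np.nan)
    return np.array(raw), np.array(factor)


sz = pd.Series([12.0, 12.5, 13.1, 14.6, 17.8, 19.1, 24.5])
alpha = 0.1
raw, factor = ewm_var_parts(sz.to_numpy(), alpha)

var_biased = sz.ewm(alpha=alpha).var(bias=True).to_numpy()
var_default = sz.ewm(alpha=alpha).var().to_numpy()  # bias=False

# bias=True should match the uncorrected variance,
# bias=False the variance multiplied by the correction factor
print(np.allclose(var_biased, raw))
print(np.allclose(var_default, raw * factor, equal_nan=True))
```

In other words, bias=True appears to mean "return the biased (uncorrected) estimate", and bias=False "apply the correction", which would explain why the reproduction has to multiply by the factor when bias is False.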

SolKul
