
I am working on an assignment where I need to simulate a time series 5000 times and fit an AR(1) and an AR(2) model to each simulated series. First I defined a function that generates a time series as follows:

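import numpy as np

def ts_gen_ar1(size, sigma, alpha1):
    # scale is the standard deviation, so pass sigma, not sigma**2
    wt = np.random.normal(0, sigma, size=size)
    x = np.zeros(size)
    # AR(1) recursion with constant 0.2, starting from x[0] = 0
    for i in np.arange(1, size):
        x[i] = 0.2 + alpha1*x[i-1] + wt[i]
    return x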

Then I executed the following statements, which take an extremely long time to run:

sample_ar1 = []
sample_ar2 = []
for i in range(0,5000):
    rt = ts_gen_ar1(2500,1,0.8)
    coeff_ar1 = sm.tsa.ARMA(rt,order=(1,0)).fit().params[1]
    coeff_ar2 = sm.tsa.ARMA(rt,order=(2,0)).fit().params[1:]
    sample_ar1.append(coeff_ar1)
    sample_ar2.append(coeff_ar2)

Can someone suggest how to speed this up? I am also getting fitting errors: for certain iterations the program reports that the MLE failed to converge.

Thanks

Josef
Rajan S.
  • which version of statsmodels are you using? – unique_beast Feb 19 '15 at 20:57
  • You probably want to follow suggestions in the answer, but if you *really* need fault safe ARMA estimation, you can see my solution to this [here](https://github.com/statsmodels/statsmodels/blob/master/statsmodels/tsa/stattools.py#L935) – jseabold Feb 19 '15 at 21:00
  • Hello Mr. jseabold. Thanks, I have been following your posts here. I completely missed that function. I will try it right away and see if it helps. I still need to speed up the iteration though, as running both fitting algorithms takes roughly 15-20 minutes. If I split the process into different cells and execute them one by one, each takes roughly 6 minutes. I will try to follow the suggestions in the answer – Rajan S. Feb 20 '15 at 00:29

1 Answer


Recursive loops for time series analysis in plain Python are slow.

The easiest solution in this case for generating the sample is to use scipy.signal.lfilter, or the statsmodels wrapper for it, arma_generate_sample: http://statsmodels.sourceforge.net/devel/generated/statsmodels.tsa.arima_process.arma_generate_sample.html
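As a rough sketch of the lfilter approach (not tested against the asker's setup): the function name ts_gen_ar1_lfilter and the const and rng arguments are my own additions for illustration. It assumes the same recursion as in the question, x[i] = 0.2 + alpha1*x[i-1] + wt[i] with x[0] = 0, and treats sigma as the standard deviation of the noise.

```python
import numpy as np
from scipy.signal import lfilter


def ts_gen_ar1_lfilter(size, sigma, alpha1, const=0.2, rng=None):
    """AR(1) sample via a linear filter instead of a Python loop.

    Hypothetical helper: const and rng are illustrative arguments,
    they are not part of the function in the question.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Filter input: constant plus white noise at every time step.
    u = const + rng.normal(0.0, sigma, size=size)
    # The loop version leaves x[0] = 0, so zero the first input.
    u[0] = 0.0
    # lfilter computes x[i] = u[i] + alpha1 * x[i-1] in compiled code.
    return lfilter([1.0], [1.0, -alpha1], u)


rt = ts_gen_ar1_lfilter(2500, 1, 0.8)
```

The denominator [1, -alpha1] is what encodes the AR(1) recursion; an AR(2) process would use [1, -alpha1, -alpha2] in the same way.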

Another possibility to speed up the random number generation is to run it vectorized for many samples, e.g. run it over blocks of 100 processes. You still have the time loop, but you can reduce the number of replication loops at the cost of using more memory.
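A minimal sketch of the blockwise idea, assuming the same AR(1) recursion as in the question. The name ts_gen_ar1_block and the n_rep, const and rng arguments are made up for this example. Each column is one independent replication, so the time loop runs once per block instead of once per series.

```python
import numpy as np


def ts_gen_ar1_block(n_rep, size, sigma, alpha1, const=0.2, rng=None):
    """Generate n_rep independent AR(1) series at once.

    Hypothetical helper; returns an array of shape (size, n_rep)
    where column j is one simulated series.
    """
    rng = np.random.default_rng() if rng is None else rng
    # One draw for the whole block: rows are time, columns replications.
    wt = rng.normal(0.0, sigma, size=(size, n_rep))
    x = np.zeros((size, n_rep))
    # The time loop remains, but each step updates all series at once.
    for t in range(1, size):
        x[t] = const + alpha1 * x[t - 1] + wt[t]
    return x


block = ts_gen_ar1_block(100, 2500, 1, 0.8)
```

With 5000 replications in blocks of 100, the Python-level time loop runs 50 times rather than 5000 times, at the cost of holding a 2500 x 100 array in memory per block.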

sm.tsa.ARMA uses a Kalman filter written in Cython and runs fast, but it handles a general ARMA process, which does more work than is needed for estimating an AR model. sm.tsa.AR estimates the parameters of an AR process by full maximum likelihood under a stationarity assumption (by default).

The simplest and fastest option is to estimate the AR process by OLS, which doesn't require non-linear optimization, or to use Yule-Walker: statsmodels.sourceforge.net/devel/generated/statsmodels.regression.linear_model.yule_walker.html
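To make the two estimators concrete, here is a NumPy-only sketch. The helpers ar_ols and ar_yule_walker are illustrative names, not statsmodels functions; in practice the statsmodels yule_walker function linked above covers the second case. The OLS version regresses x[t] on a constant and p lags, and the Yule-Walker version solves the autocovariance equations using the biased (divide by n) autocovariance estimate, which is an assumption of this sketch.

```python
import numpy as np


def ar_ols(x, p):
    """OLS fit of AR(p) with a constant; returns [const, a1, ..., ap]."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Column k holds x[t-k] for t = p, ..., n-1.
    lags = [x[p - k : n - k] for k in range(1, p + 1)]
    X = np.column_stack([np.ones(n - p)] + lags)
    beta, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return beta


def ar_yule_walker(x, p):
    """Yule-Walker fit of AR(p) on the demeaned series; returns [a1, ..., ap]."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = x.size
    # Biased sample autocovariances r[0], ..., r[p].
    r = np.array([x[: n - k] @ x[k:] / n for k in range(p + 1)])
    # Toeplitz system R a = r[1:], with R[i, j] = r[|i - j|].
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:])


# Example on one simulated AR(1) series with coefficient 0.8.
rng = np.random.default_rng(0)
wt = rng.normal(0.0, 1.0, size=2500)
rt = np.zeros(2500)
for i in range(1, 2500):
    rt[i] = 0.2 + 0.8 * rt[i - 1] + wt[i]

coeff_ar1 = ar_ols(rt, 1)[1]    # slope only, dropping the constant
coeff_ar2 = ar_ols(rt, 2)[1:]   # both lag coefficients
coeff_yw = ar_yule_walker(rt, 1)
```

Both functions are a single linear solve per series, so there is no optimizer that can fail to converge.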

The last two can estimate the parameters conditional on the initial observations, but will not have the full post-estimation features, like forecasting, that the AR and ARMA models have.

Josef