I'm creating time-series econometric regression models. The data is stored in a Pandas data frame.

How can I do lagged time-series econometric analysis using Python? I have used Eviews in the past (which is a standalone econometric program i.e. not a Python package). To estimate an OLS equation using Eviews you can write something like:

equation eq1.ls log(usales) c log(usales(-1)) log(price(-1)) tv_spend radio_spend

Note the lagged dependent and lagged price terms. It's these lagged variables that seem difficult to handle in Python, e.g. using scikit-learn or statsmodels (unless I've missed something).

Once I've created a model I'd like to perform tests and use the model to forecast.

I'm not interested in doing ARIMA, exponential smoothing, or Holt-Winters time-series projections. I'm mainly interested in time-series OLS.

Davide Fiocco
Steve Maughan

1 Answer

pandas lets you shift your data without moving the index. For example,

df.shift(-1)

moves every value up one row, so each row holds the value from the next index (a lead), while

df.shift(1)

moves every value down one row, so each row holds the value from the previous index (a lag).

So if you have a daily time series, you can use shift(1) to create a 1-day lag of your price values:

df['lagprice'] = df['price'].shift(1)
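
To make the direction of the shift concrete, here is a small sketch on a made-up daily series. The dates, prices, and the `leadprice` column name are invented for illustration and are not part of the question's data.

```python
import pandas as pd

# Hypothetical daily price series, invented for illustration.
idx = pd.date_range("2016-10-01", periods=4, freq="D")
df = pd.DataFrame({"price": [10.0, 11.0, 12.0, 13.0]}, index=idx)

# shift(1): each row receives the previous day's price (a lag).
df["lagprice"] = df["price"].shift(1)

# shift(-1): each row receives the next day's price (a lead).
df["leadprice"] = df["price"].shift(-1)

print(df)
```

The first row of `lagprice` and the last row of `leadprice` are NaN, because no earlier or later observation exists to fill them.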

After that, if you want to do OLS, you can look at the scipy module here (note that linregress fits a regression with a single explanatory variable):

http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.stats.linregress.html
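
As a rough sketch of how the equation from the question could be estimated once the lags exist as columns, the example below builds the lagged terms with `shift(1)`, drops the row that has no lagged value, and fits the coefficients with `numpy.linalg.lstsq`. This swaps in NumPy for `linregress` because the equation has several regressors. The data is randomly generated, so the coefficients are meaningless. The names `model_df`, `log_usales_lag1`, and `log_price_lag1` are hypothetical, and the forecast assumes next period's `tv_spend` and `radio_spend` equal the last observed values.

```python
import numpy as np
import pandas as pd

# Synthetic data, invented for illustration; column names follow the question.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "usales": rng.uniform(50.0, 100.0, n),
    "price": rng.uniform(5.0, 10.0, n),
    "tv_spend": rng.uniform(0.0, 20.0, n),
    "radio_spend": rng.uniform(0.0, 10.0, n),
})

# Dependent variable plus the regressors of the Eviews equation:
# log(usales) c log(usales(-1)) log(price(-1)) tv_spend radio_spend
model_df = pd.DataFrame({
    "log_usales": np.log(df["usales"]),
    "const": 1.0,
    "log_usales_lag1": np.log(df["usales"]).shift(1),
    "log_price_lag1": np.log(df["price"]).shift(1),
    "tv_spend": df["tv_spend"],
    "radio_spend": df["radio_spend"],
}).dropna()  # the first row has no lagged value (NaN), so it is dropped

y = model_df["log_usales"].to_numpy()
X = model_df.drop(columns="log_usales").to_numpy()

# Ordinary least squares via NumPy's least-squares solver.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
coefs = pd.Series(beta, index=model_df.columns.drop("log_usales"))
print(coefs)

# One-step-ahead forecast of log(usales). Assumption: next period's
# tv_spend and radio_spend equal the last observed values.
last = df.iloc[-1]
x_next = np.array([
    1.0,
    np.log(last["usales"]),   # becomes log(usales(-1)) next period
    np.log(last["price"]),    # becomes log(price(-1)) next period
    last["tv_spend"],
    last["radio_spend"],
])
forecast_log_usales = float(x_next @ beta)
print(forecast_log_usales)
```

The lagged columns do have to exist before estimation in this approach, but building them is a one-line `shift` per term. For standard errors and diagnostic tests, the same `X` and `y` could be passed to a dedicated regression package such as statsmodels instead of `lstsq`.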

Steven G
  • 1
    Thanks - this looks good. Do I need to create all of the lagged series before estimating the model? Or is there a way to create a model which calculates the lagged values as and when needed? – Steve Maughan Oct 03 '16 at 22:14
  • 1
    Ordinary least squares regression doesn't normally need any lags. You just pass X and Y values to estimate the beta that minimises the error; then you can estimate Y for any new X values. Lagged values are normally introduced in an 'auto-regressive' model, where Xt-1 has something to do with Xt, but that would be a different model. – Steven G Oct 03 '16 at 22:22
  • 1
    But let's say you want to estimate a beta between Xt-1 and Xt: you could use scipy and pass df['price'].shift(1) as X and df['price'] as Y. This would calibrate a beta such that Xt = B*Xt-1. – Steven G Oct 03 '16 at 22:23
  • 1
    Thanks, this is helpful. BTW in many econometric time series models a lagged variable is used (as well as a lagged dependent). For example, a price term could be lagged. – Steve Maughan Oct 03 '16 at 23:03
  • 1
    You saved me. Truly. God bless you. All the blessings with you. ;) – RyanC Apr 21 '18 at 02:03