I would like to use formulas to specify a "baseline" model for some models fit using statsmodels. For example, I'd like to be able to specify a formula to pass to an ols or Logit model that simply predicts the mean of the observed dependent variable for all observations. I know that I can get these numbers simply by calculating the mean of the dependent variable, but I would like to have a model that produces these results (e.g. so I can use its methods). Is there a patsy syntax for accomplishing this?

orome

1 Answer

If you use a formula with only the intercept term (`y ~ 1`), the fitted model predicts the mean of the dependent variable for every observation:

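import pandas as pd
import statsmodels.formula.api as smf

# Only y enters the formula; X is ignored by an intercept-only model.
data = pd.DataFrame({'y': [1, 5, 9],              # mean(y) == 5
                     'X': [2013, 0.001, 19.99]})  # doesn't matter
model = smf.ols('y ~ 1', data=data).fit()
model.params['Intercept']                    # approximately 5, the mean of y
model.predict(pd.DataFrame({'X': [3.14]}))   # one prediction per row, each approximately 5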
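
The same intercept-only formula should also work for the Logit case mentioned in the question. The following is only a sketch on made-up data (the column names `y` and `hit` and the values are hypothetical, not from the original post): for an intercept-only logit fit, the predicted probability is the observed share of ones, which is the mean of the binary outcome.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical toy data: a continuous outcome and a binary outcome.
df = pd.DataFrame({
    "y": [1.0, 5.0, 9.0, 3.0, 7.0],
    "hit": [0, 1, 1, 0, 1],
})

# Intercept-only OLS: every fitted value is the sample mean of y.
ols_fit = smf.ols("y ~ 1", data=df).fit()
ols_baseline = ols_fit.params["Intercept"]

# Intercept-only Logit: the intercept is the log-odds of the sample mean,
# so the fitted probability is the observed share of ones.
logit_fit = smf.logit("hit ~ 1", data=df).fit(disp=0)
logit_baseline = logit_fit.predict()  # in-sample predicted probabilities

# Compare each baseline with the plain sample mean.
print(np.isclose(ols_baseline, df["y"].mean()))
print(np.allclose(ols_fit.fittedvalues, df["y"].mean()))
print(np.allclose(logit_baseline, df["hit"].mean(), atol=1e-6))
```

Because these are ordinary fitted results objects, methods such as `summary()` are available on the baseline just as they are on a fuller model.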
elyase
  • I'm surprised that the result of `predict` is not the same dimension as its argument. That's not how I'd expect it to work. – orome Mar 22 '14 at 03:04
  • I've got [another question](http://stackoverflow.com/q/22580477/656912) related to the (odd) way in which `predict` returns results. – orome Mar 22 '14 at 16:46