I'm translating R code into Python and having trouble replicating R's lm {stats} function, which takes a 'weights' argument that allows weights to be used in the fitting process.

My ultimate goal is to simply run a weighted linear regression in Python using the statsmodels library.

Searching through the statsmodels issues, I found "caseweights in linear models" (#743) and "SUMM/ENH rare events, unbalanced sample, matching, weights" (#2701), which make me think this may not be possible with statsmodels.

Is it possible to add weights to GLM models in statsmodels, or is there a better way to run a weighted linear regression in Python?

BarclayK

1 Answer

WLS supports weights for the linear model; the weights are interpreted as inverse variances when computing the result statistics. See http://www.statsmodels.org/stable/generated/statsmodels.regression.linear_model.WLS.html

The unreleased version of statsmodels has frequency weights for GLM, but no variance weights. See freq_weights in http://www.statsmodels.org/dev/generated/statsmodels.genmod.generalized_linear_model.GLM.html

(There are many open issues about expanding the types of weights and adding weights to other models, but those are not available yet.)

Josef
  • Sounds like a weighted linear regression model is actually fairly difficult to do (without rolling your own) in Python vs. R. Would you happen to know if there are any other libraries for weighted regressions in Python outside of Statsmodels or Scikit? – BarclayK Nov 30 '16 at 22:52
  • What's difficult about WLS? It has the same pattern as OLS except that you provide an additional 1-D weight array. – Josef Nov 30 '16 at 23:26