I'm analysing a social survey and need to use the survey package to account for oversampling. I fitted a log-linear regression with the svyglm command and everything worked fine. However, the output is an svyglm object, which apparently does not include the R-squared value the way normal lm objects do. So how do I get this value, and how do I include it in my regression table if it's not part of the actual object? (I'm using the stargazer package to create the tables for my paper.)

Thanks in advance :)

Phil
Moritary
  • There's no R-squared for GLMs; report AIC/BIC instead. Also read: https://stats.stackexchange.com/questions/3559/which-pseudo-r2-measure-is-the-one-to-report-for-logistic-regression-cox-s – jay.sf Oct 02 '22 at 14:16
  • You can also use pseudo R-squared measures for GLM https://stats.stackexchange.com/questions/11676/pseudo-r-squared-formula-for-glms – GRowInG Oct 02 '22 at 14:58
  • Question is probably insanely stupid, but why is there no R-squared for GLM? @GRowInG I tried to calculate pseudo R-squared with `psrsq` but it says "Only implemented for discrete data". – Moritary Oct 02 '22 at 15:04
  • That question is not stupid at all, have a look at the first answer here https://stackoverflow.com/questions/26541899/why-doesnt-statsmodels-glm-have-r2-in-results – GRowInG Oct 02 '22 at 15:11
  • @GRowInG Thanks for the link. But do you have any idea how I can calculate pseudo R-squared? I'm getting the error "Only implemented for discrete data", since apparently `psrsq()` only works with binomial or Poisson models, according to this link: https://rdrr.io/cran/survey/src/R/rsquared.R – Moritary Oct 02 '22 at 18:51

1 Answer

If you have a linear regression model fitted with svyglm, you can get the residual variance and the total variance directly; you don't need a pseudo-R-squared.

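# y is the outcome on the scale used in the model (e.g. the log-transformed variable)
total_var <- svyvar(~y, design)           # svyvar expects a formula, not a bare name
resid_var <- summary(model)$dispersion    # design-based residual variance
# as.numeric() drops the svystat attributes so the result is a plain number
rsq <- 1 - as.numeric(resid_var) / as.numeric(total_var)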
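
To make the calculation above concrete, here is a minimal sketch on the `apistrat` example data that ships with the survey package. The data set, the design object `dstrat`, and the column name `log_api00` are stand-ins chosen for illustration, not the asker's survey. The sketch assumes a Gaussian svyglm with a log-transformed outcome, and it computes the total variance on the same (log) scale as the model. The last step shows one way the value might be added to a stargazer table through its `add.lines` argument; it is skipped if stargazer is not installed.

```r
library(survey)

# Example data bundled with the survey package (a stand-in for the real survey)
data(api)

# Stratified design; 'dstrat' is an illustrative name
dstrat <- svydesign(id = ~1, strata = ~stype, weights = ~pw,
                    data = apistrat, fpc = ~fpc)

# Store the log-transformed outcome in the design so that the model and
# the total variance are guaranteed to use the same scale
dstrat <- update(dstrat, log_api00 = log(api00))

# Linear model (Gaussian family is the svyglm default)
model <- svyglm(log_api00 ~ ell + meals, design = dstrat)

# Design-based total and residual variance of the modelled outcome
total_var <- svyvar(~log_api00, dstrat)
resid_var <- summary(model)$dispersion

# Strip the svystat attributes before doing arithmetic
rsq <- 1 - as.numeric(resid_var) / as.numeric(total_var)
print(rsq)

# Optional: append the value as an extra row in a stargazer table
if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer(model, type = "text",
                       add.lines = list(c("R-squared", sprintf("%.3f", rsq))))
}
```

Note that if the model is fitted on `log(y)` but the total variance is taken from the untransformed `y`, the two variances are on different scales and the ratio is no longer meaningful, which is one possible cause of an implausibly high value.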

Thomas Lumley
  • Thanks a lot! This should actually do the trick. However I'm actually getting really weird values out of this. I have very low coefficients (simple linear model with log transformed dependent variable and the coefficient is only 0.016750) but the r-squared calculated is 0.994. Any idea how this is possible? – Moritary Oct 06 '22 at 12:29