GLM - No R-squared output when running simple linear regression with categorical predictor

Question

I am running a simple linear regression with a numerical response (wellbeing) and a categorical explanatory (education) variable. I know that there are ideas about dealing with the categorical variable as continuous, but in this case I want to keep treating it as a factor.

Now...

When I want to assess the quality of this model with R-squared, the glance() function from the broom package doesn't provide the metric.

As I understand it, the null model here is the mean of the response variable, and the linear model I've fitted regresses the response on the explanatory variable. There must be some kind of effect size to gauge here.

What do you think? Why can't I get R-squared, and is there another kind of effect size that would tell me how much the model improves by including this categorical predictor?

library(tibble)  # provides tibble(); needed before the data frame is built

df <- tibble(education = c("Low", "Medium", "High", "Low", "Medium", "High", "High"),
             wellbeing = c(7, 6, 7, 4, 5, 4, 5))
df$education <- as.factor(df$education)

mdl <- glm(
  wellbeing ~ education + 0, 
  data = df,
  family = gaussian
)

library(dplyr)
library(broom)
# glance() on a glm object has no r.squared column, so pull() fails here
mdl %>%
  glance() %>%
  pull(r.squared)

Use `lm` instead of `glm` with `family = gaussian`. There is no R-squared value calculated by `summary.glm`. — Roland, Jun 28 '21 at 11:21
@Roland, that's it, thank you. Of course, this was again way simpler than I envisioned :) — SHW, Jun 28 '21 at 12:25
@Roland, do you know if an output of R2 exists within the glm functionality? — SHW, Jun 29 '21 at 09:16
No, in general, for a generalized linear model, R-squared is not a sensible concept. — Roland, Jun 29 '21 at 09:58

score 2 · Answer 1 · answered Jul 25 '21 at 14:46

As pointed out in the comments by @Roland, you can use lm() and pull R-squared from its summary():

# lm() has no family argument; passing one only triggers a warning
summary(lm(
  wellbeing ~ education + 0,
  data = df
))$r.squared

[1] 0.9552469

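Alternatively, we know that R-squared for ordinary least squares is:

R^2 = 1 - RSS / TSS

where RSS is the residual sum of squares and TSS is the total sum of squares. For a Gaussian GLM the deviance equals the RSS and the null deviance equals the sum of squares of the null model, so we can compute the same value from your glm results: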

mdl <- glm(
  wellbeing ~ education + 0, 
  data = df,
  family = gaussian
)

1 - mdl$deviance/mdl$null.deviance

[1] 0.9552469
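
One caveat worth checking: because the formula uses `+ 0`, there is no intercept, and R then measures R-squared against a null model that predicts zero rather than the mean of wellbeing. The sketch below is only an illustration in base R (the names `fit_no_int`, `fit_int` and `glm_int` are mine, not from the question). It refits the model with an intercept so that the baseline is the mean, which seems to be the comparison the question has in mind.

```r
# Data from the question, rebuilt with base R so no extra packages are needed.
df <- data.frame(
  education = factor(c("Low", "Medium", "High", "Low", "Medium", "High", "High")),
  wellbeing = c(7, 6, 7, 4, 5, 4, 5)
)

# Without an intercept, summary.lm() reports an uncentered R-squared:
# 1 - RSS / sum(y^2), i.e. the baseline is a model that always predicts 0.
fit_no_int <- lm(wellbeing ~ education + 0, data = df)
r2_no_int  <- summary(fit_no_int)$r.squared

# With an intercept, the baseline is the mean of the response.
fit_int <- lm(wellbeing ~ education, data = df)
r2_int  <- summary(fit_int)$r.squared

# The same centered value from a Gaussian glm, via its deviances.
glm_int <- glm(wellbeing ~ education, data = df, family = gaussian)
r2_glm  <- 1 - glm_int$deviance / glm_int$null.deviance

c(no_intercept = r2_no_int, intercept = r2_int, glm = r2_glm)
```

On this toy data the intercept version comes out at roughly 0.005, so the high value of 0.955 mostly reflects that wellbeing is far from zero, not that education explains much of its variation.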
