
I have a logit model with 4 independent variables:

logit <- glm(y ~ x1 + x2 + x3 + x4, family = binomial(), data = df)

All variables in the model are dichotomous (0,1).

I want to get the predicted probabilities for when x1=1 vs x1=0, while holding all other variables in the model constant, using the following code:

mean(predict(logit, transform(df, x1 = 1), type = "response"))
mean(predict(logit, transform(df, x1 = 0), type = "response"))

Is this the correct way to do this? I ask because I tried a few different logit models and compared the difference between the two values produced by the code above with the number produced by the margins command:

summary(margins(logit, variables = "x1"))

For most models I tried, the difference between the two values equals the number produced by the margins command above, but for one model specification the numbers were slightly different.

Finally, when I use weights in my logit model, do I have to do anything differently when I calculate the predicted probabilities than when I don't include any weights?

Rene
  • Your code makes sense to me. If you're using weights, perhaps you would want a weighted mean to calculate the average probability rather than an unweighted mean. – DaveArmstrong Oct 18 '22 at 19:08
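
A minimal sketch of that suggestion, assuming the case weights live in a hypothetical column `df$w` (not a column from the original post):

# weighted averages of the counterfactual predictions;
# df$w is a hypothetical column holding the model weights
weighted.mean(predict(logit, transform(df, x1 = 1), type = "response"), w = df$w)
weighted.mean(predict(logit, transform(df, x1 = 0), type = "response"), w = df$w)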

1 Answer


Your approach seems reasonable. The only thing I’ll note is that there are many packages that will do this automatically for you and can also estimate standard errors. One example is the marginaleffects package. (Disclaimer: I am the author.)

library(marginaleffects)

mod <- glm(am ~ vs + hp, data = mtcars, family = binomial)

# average predicted outcome for observed units with vs=1 or vs=0
predictions(
    mod,
    by = "vs")
#>       type vs predicted std.error statistic      p.value  conf.low conf.high
#> 1 response  0 0.3333333 0.1082292  3.079883 0.0020708185 0.1212080 0.5454587
#> 2 response  1 0.5000000 0.1328955  3.762356 0.0001683204 0.2395297 0.7604703

# are the average predicted outcomes equal?
predictions(
    mod,
    by = "vs",
    hypothesis = "pairwise")
#>       type  term  predicted std.error  statistic   p.value   conf.low conf.high
#> 1 response 0 - 1 -0.1666667 0.1713903 -0.9724392 0.3308321 -0.5025855 0.1692522

FYI, there is a wts argument to handle weights in the predictions() function.
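
For instance, a minimal sketch using the `wt` column of `mtcars` purely as an illustrative weight variable (not a substantive choice):

# weighted average predictions; mtcars$wt serves only as example weights
predictions(
    mod,
    by = "vs",
    wts = mtcars$wt)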

Edit:

# replicate the question's counterfactual approach: duplicate the full
# dataset with vs set to 0 and then to 1, and average the predictions
predictions(
    mod,
    newdata = datagridcf(vs = 0:1),
    by = "vs")

# equivalent shortcut via the `variables` argument
predictions(
    mod,
    variables = list(vs = 0:1),
    by = "vs")
Vincent
  • I tested your command and I noticed that my code produces quite different predicted probabilities than the 'predictions' function when I use more than one independent variable (one probability is off by about 5 percentage points). This is true whether I use weights or not. Do you have any guesses why that is? – Rene Oct 18 '22 at 19:55
  • My code only computes predicted probabilities for the *actually observed* rows of the dataset. Your code creates two different datasets with all rows, but with the variable alternatively set to 0 or 1. You can achieve the same thing by adding this argument: `newdata = datagridcf(vs = 0:1)`. The `datagridcf()` function creates a "counterfactual" grid in which the full dataset is replicated once for every combination of the variables supplied. In this case, we get two copies of the data, one with `vs` set to 0 and one with `vs` set to 1. – Vincent Oct 18 '22 at 20:39
  • Could you elaborate what you mean by “actually observed rows”? (None of the variables in my model have missing data, in case you mean that my code that calculates predicted probabilities includes cases that aren’t included in the logit model due to missingness.) – Rene Oct 18 '22 at 20:51
  • Your code calculates one predicted value for every row of `df` with all values of `x1` set to 1. Then, it calculates one predicted value for every row of `df` with all values of `x1` set to 0. It's as if you had duplicated the dataset to compute two sets of "counterfactual" predictions. My original code calculates one predicted value for each row of the dataset, using the actually observed values of the regressor, instead of setting them to counterfactual values. I edited my post with two alternative syntaxes that do the same thing you are doing. – Vincent Oct 18 '22 at 21:00
  • Thanks! I'll have to figure out which of these two "types" of predicted probabilities provides a better estimate of the effect of x1 on y. The `summary(margins(logit, variables = "x1"))` command that produces the average marginal effect of x1 on y seems to correspond more closely (but not exactly) to the way I originally calculated the predicted probabilities. That is, the margins command gives a value of .0706, and the difference between the predicted probabilities I originally calculated is .0716, while the difference based on the code you first shared is .12431. – Rene Oct 18 '22 at 22:10
  • `margins` calculates the slope. This is not equivalent to the manual computations you showed above. The `marginaleffects` function of my package replicates and enhances `margins`. – Vincent Oct 18 '22 at 23:53
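
For reference, a minimal sketch of the equivalent computation with marginaleffects, assuming a recent release where the function is named avg_slopes(). Note that marginaleffects treats a 0/1 regressor as binary and reports the discrete 1-0 contrast by default, while margins differentiates numeric variables, which can explain small discrepancies like the ones discussed above:

# average marginal effect of vs, analogous to
# summary(margins(mod, variables = "vs"))
avg_slopes(mod, variables = "vs")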