I want to determine the marginal effects of each independent variable in a probit regression as follows:

  • predict the (base) probability with each variable at its mean
  • for each variable, predict the change in probability, relative to the base, when that variable is set to its mean + 1 standard deviation (the other variables held at their means); a sketch follows the model below

In one of my regressions, I have a multiplicative variable, as follows:

my_probit <- glm(a ~ b + c + I(b*c), family = binomial(link = "probit"), data=data)
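For concreteness, here is a minimal sketch of the two steps above, using the names from this model and base R only (whether the I(b*c) column gets recomputed from the shifted b is exactly what the questions below ask):

# Base probability: both predictors at their means
base_row <- data.frame(b = mean(data$b), c = mean(data$c))
p_base <- predict(my_probit, newdata = base_row, type = "response")

# Probability with b at mean + 1 SD, c held at its mean
shift_b <- transform(base_row, b = b + sd(data$b))
p_shift_b <- predict(my_probit, newdata = shift_b, type = "response")

p_shift_b - p_base  # the "marginal effect" of b in the sense used here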

Two questions:

  1. When I determine the marginal effects using the approach above, will the value of the multiplicative term reflect the value of b or c taking the value mean + 1x standard deviation of the variable?
  2. Same question, but with an interaction term (* and no I()) instead of a multiplicative term.

Many thanks

bdu
  • Can you clarify how you are using the term "marginal" here? I would think of the marginal relationship b/t a & b to be the output from `a~b` (ie, w/o the `+c+I(b*c)`). Also, I'm confused about your distinction b/t "interaction term" & "multiplicative term". These seem synonymous to me. Eg, `a+b+I(a*b)`==`a*b`. – gung - Reinstate Monica Jun 23 '13 at 17:42
  • My understanding is that the I() function and the interaction operators (*, : or /) are not quite the same, see [link](http://cran.r-project.org/doc/manuals/R-intro.html#Formulae-for-statistical-models). By marginal effect, I mean the effect that a marginal change (by one standard deviation in my case) of an independent variable will have on the predicted probability, to assess its economic significance. – bdu Jun 23 '13 at 18:16
  • Reviewing that link, it seems that the case of a numeric-by-numeric interaction is not actually addressed. And it turns out on testing that @gung is correct in thinking that the results for `a+b+I(a*b)` and `a*b` are equivalent. Clearly they would _not_ be equivalent for `a*a` (which is the same as `a`) and `I(a*a)`, which is `I(a^2)`. – IRTFM Jun 23 '13 at 19:11
  • OK, thanks for clarifying this. This would mean that my two questions are actually one single question. Can someone help with question 1? Thanks. – bdu Jun 23 '13 at 20:57

1 Answer

When interpreting the results of models involving interaction terms, the general rule is: do not interpret individual coefficients in isolation. The very presence of an interaction means that the meaning of each term's coefficient depends on the values of the other variates used for prediction. The right way to look at the results is to construct a "prediction grid", i.e. a set of values spaced across the range of interest (ideally within the domain of data support). The two essential functions for this process are expand.grid and predict.

dgrid <- expand.grid(b = fivenum(data$b)[2:4], c = fivenum(data$c)[2:4])
# A grid of the lower hinge, median, and upper hinge of `b` and `c`.

predict(my_probit, newdata = dgrid)

You may want the predictions on a scale other than the default (which returns the linear predictor); they are usually easier to interpret on the response (probability) scale:

predict(my_probit, newdata = dgrid, type = "response")

Be sure to read ?predict and ?predict.glm and work with some simple examples to make sure you are getting what you intended.
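For instance, here is a self-contained toy run (simulated data, names chosen to mirror the question, not from the original post) that you can use to check that the grid and the predictions behave as expected:

set.seed(1)
data <- data.frame(b = rnorm(200), c = rnorm(200))
# Simulate a binary outcome from a probit model with a b:c interaction
data$a <- rbinom(200, 1, pnorm(0.3 * data$b - 0.5 * data$c + 0.4 * data$b * data$c))

my_probit <- glm(a ~ b + c + I(b * c), family = binomial(link = "probit"), data = data)

dgrid <- expand.grid(b = fivenum(data$b)[2:4], c = fivenum(data$c)[2:4])
cbind(dgrid, p = predict(my_probit, newdata = dgrid, type = "response"))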

Predictions from models containing interactions (at least those involving 2 covariates) should be thought of as being surfaces or 2-d manifolds in three dimensions. (And for 3-covariate interactions as being iso-value envelopes.) The reason that non-interaction models can be decomposed into separate term "effects" is that the slopes of the planar prediction surfaces remain constant across all levels of input. Such is not the case with interactions, especially those with multiplicative and non-linear model structures. The graphical tools and insights that one picks up in a differential equations course can be productively applied here.
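To make the surface idea concrete, one possible way (an illustration added here, not part of the original answer) to look at the fitted probability surface is a fine grid over the observed ranges of b and c, drawn with base-graphics contour():

fine <- expand.grid(b = seq(min(data$b), max(data$b), length.out = 50),
                    c = seq(min(data$c), max(data$c), length.out = 50))
fine$p <- predict(my_probit, newdata = fine, type = "response")

# expand.grid varies b fastest, so filling a 50 x 50 matrix column-wise
# puts b along the rows and c along the columns, as contour() expects.
z <- matrix(fine$p, nrow = 50)
contour(x = unique(fine$b), y = unique(fine$c), z = z,
        xlab = "b", ylab = "c", main = "Predicted P(a = 1)")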

IRTFM
  • +1, this is the right advice. A similar approach is to plot the relationship b/t `a` & `b` at several different values of `c` (eg, the mean of `c` & the mean +/- 1 SD). I illustrated a case of plotting several different lines (albeit w/o the interaction) in an answer on CV here: [Graphing a probability curve for a logit model with multiple predictors](http://stats.stackexchange.com/questions/31597//31600#31600). – gung - Reinstate Monica Jun 23 '13 at 23:29
  • Thanks. This is indeed very useful. For the avoidance of doubt, I would have thought it should be possible to estimate two marginal effects for the interaction term: one for mean(a) + SD(a) and mean(b), and another one for mean(a) and mean(b) + SD(b). Should I rule out that approach completely? If not, how should I go about it? – bdu Jun 24 '13 at 06:57
  • I think you should get away from the notion that there are separate "interaction terms" and "marginal terms". You need to think of the linear and interaction contributions as parts of a single interaction pattern. You should use at least the mean and the mean +/- SD for both variates (sketched below). It would only be slightly more complex than the median +/- hinges approach that I offered. – IRTFM Jun 24 '13 at 15:07
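A minimal sketch of that mean and +/- 1 SD grid, using b and c from the model in the question (this code is an illustration added here, not part of the original exchange):

# Mean and +/- 1 SD for both predictors, as suggested above
b_vals <- mean(data$b) + c(-1, 0, 1) * sd(data$b)
c_vals <- mean(data$c) + c(-1, 0, 1) * sd(data$c)
grid_sd <- expand.grid(b = b_vals, c = c_vals)
grid_sd$p <- predict(my_probit, newdata = grid_sd, type = "response")
round(grid_sd, 3)  # 3 x 3 grid of predicted probabilities

# Or, along the lines of gung's comment: predicted probability against b,
# one curve for each of the three values of c
b_seq <- seq(min(data$b), max(data$b), length.out = 100)
curves <- expand.grid(b = b_seq, c = c_vals)
curves$p <- predict(my_probit, newdata = curves, type = "response")
matplot(b_seq, matrix(curves$p, ncol = 3), type = "l", lty = 1,
        xlab = "b", ylab = "predicted P(a = 1)")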