
I've created a linear regression model in R:

lm.data <- lm(sharer_prob ~ sympathy + trust + fear + greed, na.action=NULL, data=data)

The independent variables greed, sympathy, trust, and fear each take the values 0, 1, 2, or 3. The response variable is sharer_prob, which takes values from 0 to 1. I also defined the following interaction terms.

IX_greed <- data$greed * data$sharer_prob
IX_sympathy <- data$sympathy * data$sharer_prob
IX_fear <- data$fear * data$sharer_prob
IX_trust <- data$trust * data$sharer_prob

That makes it possible for me to regress pairs of the independent variables like so:

lmFGData <- lm(data$sharer_prob ~ IX_fear * IX_greed)
lmFSData <- lm(data$sharer_prob ~ IX_fear * IX_sympathy)
lmFTData <- lm(data$sharer_prob ~ IX_fear * IX_trust)
lmGSData <- lm(data$sharer_prob ~ IX_greed * IX_sympathy)
lmGTData <- lm(data$sharer_prob ~ IX_greed * IX_trust)
lmTSData <- lm(data$sharer_prob ~ IX_trust * IX_sympathy)

Unfortunately, the resulting models fail three of the four assumptions for linear regression. So I created a new model that regresses the logit of sharer_prob against the independent variables like so:

lm.Logitdata <- lm(logit(sharer_prob) ~ sympathy + trust + fear + greed, na.action=NULL, data=data)
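
For reference, the logit of a proportion p is log(p / (1 - p)), which base R provides as qlogis(). The sketch below is only illustrative: the values in p are made up, and logit_adj is a hypothetical helper (not part of my model) whose 0.025 adjustment is an arbitrary choice that keeps the boundary values 0 and 1 finite.

```r
# Hypothetical example proportions, including the boundary values 0 and 1.
p <- c(0, 0.25, 0.5, 0.75, 1)

# Plain logit: log(p / (1 - p)); base R provides this as qlogis().
plain <- qlogis(p)  # -Inf at p = 0 and Inf at p = 1

# Hypothetical helper: shrink p toward 0.5 before taking the logit so that
# boundary values stay finite. The default of 0.025 is an arbitrary choice.
logit_adj <- function(p, adjust = 0.025) {
  shrunk <- 0.5 + (1 - 2 * adjust) * (p - 0.5)
  qlogis(shrunk)
}

adjusted <- logit_adj(p)
print(cbind(p, plain, adjusted))
```

Since sharer_prob ranges from 0 to 1, any exact 0 or 1 would become infinite under the plain transform, which is why some adjustment of this kind may be needed before fitting.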

How do I create expressions that regress the interacting pairs of variables?

  • Option A: Use the same expressions, but change the names of the objects that represent each new model?
  • Option B: Create a data frame containing the independent variables and the transformed response variable, and use that in each expression?
  • Option C: Do something else?
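
To make Option B concrete, here is roughly what I have in mind. This is a sketch under assumptions, not my real analysis: the data frame sim is simulated, the column name logit_sharer and the list pair_models are hypothetical, qlogis() stands in for whatever logit function is used, and the interactions are written between the predictors themselves (fear * greed) rather than between the IX_ products above.

```r
set.seed(1)
n <- 200

# Simulated stand-in for the real data: four predictors coded 0-3 and a
# response strictly inside (0, 1) so that the logit is finite.
sim <- data.frame(
  sympathy    = sample(0:3, n, replace = TRUE),
  trust       = sample(0:3, n, replace = TRUE),
  fear        = sample(0:3, n, replace = TRUE),
  greed       = sample(0:3, n, replace = TRUE),
  sharer_prob = runif(n, min = 0.05, max = 0.95)
)

# Store the transformed response as a column of the data frame.
sim$logit_sharer <- qlogis(sim$sharer_prob)

# Build one model per pair of predictors; a * b expands to a + b + a:b.
predictors <- c("fear", "greed", "sympathy", "trust")
pairs <- combn(predictors, 2, simplify = FALSE)

pair_models <- lapply(pairs, function(pr) {
  f <- as.formula(paste("logit_sharer ~", pr[1], "*", pr[2]))
  lm(f, data = sim)
})
names(pair_models) <- vapply(pairs, paste, character(1), collapse = "_")

print(summary(pair_models[["fear_greed"]]))
```

Keeping the transformed response in the data frame means every pairwise formula can share the same data argument, instead of mixing data$ references with free-standing vectors.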

Many thanks for any help you can offer.

Larry John

  • You can supply interaction terms into the model equation. For example: summary(lm(mpg ~ cyl + hp + wt + gear + gear * cyl, data = mtcars)). Is this what you are after? I may be missing something. – Gopala Jan 10 '16 at 18:44
  • Yes, that's what I did in the original model. The question is, do I need to change the source of the data in the new model to account for the fact that instead of fitting an equation to compute sharer_prob, it will be fitting an equation to compute the logit of sharer_prob? – Larry John Jan 10 '16 at 18:49
  • I have not used logit fitting this way before. I use glm(formula, family = 'binomial', data = data). That method takes the raw data frame and does the right thing. – Gopala Jan 10 '16 at 18:58
  • Yes, that works fine if your response is binomial. Mine is not. – Larry John Jan 10 '16 at 19:05
  • @LarryJohn Can you explain why you calculate interactions *with the outcome* variable? Surely this is a special case of what Kronmal calls "The Fallacy of the Ratio". The modeler should consider interactions between *input* regressors, not the output. – AdamO May 18 '18 at 18:23
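
One way to read the last comment is to put all pairwise interactions between the input regressors into a single model on the logit scale. The sketch below assumes simulated data (the data frame sim and the object names additive and pairwise are hypothetical) and uses base R's qlogis() as the logit; it is an illustration of the formula syntax, not a recommended final model.

```r
set.seed(42)
n <- 300

# Simulated stand-in data: predictors coded 0-3, response inside (0, 1).
sim <- data.frame(
  sympathy    = sample(0:3, n, replace = TRUE),
  trust       = sample(0:3, n, replace = TRUE),
  fear        = sample(0:3, n, replace = TRUE),
  greed       = sample(0:3, n, replace = TRUE),
  sharer_prob = runif(n, min = 0.05, max = 0.95)
)

# Main effects only.
additive <- lm(qlogis(sharer_prob) ~ sympathy + trust + fear + greed,
               data = sim)

# (a + b + c + d)^2 expands to all main effects plus all two-way interactions.
pairwise <- lm(qlogis(sharer_prob) ~ (sympathy + trust + fear + greed)^2,
               data = sim)

print(names(coef(pairwise)))

# F-test of whether the six interaction terms jointly improve the fit.
print(anova(additive, pairwise))
```

Here the interaction terms are products of predictors only, so the response never appears on the right-hand side of the formula.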
