I have a model with several main effects and several interactions. I want to avoid any models that include only the three interaction terms. So basically I want all combinations of main effects, and of main effects with various interactions, but nothing that contains only the interactions.

M1<-glm(R1 ~ scale(X1)+ scale(X2)+ scale(X3)+ scale(X3*X1)+scale(X2*X1)+scale(X2*X3))

I have used 'expression' to subset before for quadratics and it's always worked but for some reason I can't figure out the interactions.

msubset <- expression((`scale(X2)`|!`scale(X2):scale(X1)`)& 
(`scale(X3)`|!`scale(X3):scale(X1)`))
#dredge for model selection
M2<-dredge(M1,subset=msubset,rank=AIC)

Any help would be appreciated.

Rebecca

1 Answer

I have never used this functionality in dredge before (it's great to know it exists). I think your issue lies in the scale() calls in your formula, which I don't think are really necessary. Your model's coefficients are of course influenced by the magnitude of the explanatory variables, but rescaling does not affect whether a variable is found to help explain the response. If possible, I recommend removing them; the procedure that you want then seems to work in dredge():

library("MuMIn")

n <- 100
set.seed(1)
df <- data.frame(R1 = rnorm(n), X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n))

fmla <- formula(R1 ~ (X1 + X2 + X3)^2)
M1 <- glm(formula = fmla, data = df)
summary(M1)

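# dredge() refuses to run unless na.action is "na.fail", so set it temporarily
options(na.action = "na.fail")
# Subset rule: an interaction with X1 may only appear together with X2 (or X3).
# If a term name containing ":" is not recognised, try back-ticks, e.g. `X1:X2`.
M2 <- dredge(global.model = M1, subset = (X2 | !X1:X2) & (X3 | !X1:X3), rank = AIC)
options(na.action = "na.omit")  # restore the default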
M2

# Global model call: glm(formula = fmla, data = df)
# ---
# Model selection table 
#      (Int)        X1      X2       X3    X1:X2   X1:X3   X2:X3 df   logLik   AIC delta weight
# 1  0.10890                                                      2 -130.655 265.3  0.00  0.253
# 56 0.11640  0.068320 0.03391 -0.04753          -0.2215 -0.1414  7 -126.119 266.2  0.93  0.159
# 5  0.11120                   -0.04568                           3 -130.528 267.1  1.75  0.106
# 3  0.10840           0.01596                                    3 -130.638 267.3  1.97  0.095
# 24 0.10090  0.056060 0.03889 -0.06773          -0.2180          6 -127.758 267.5  2.21  0.084
# 64 0.11350  0.077270 0.03040 -0.04011 -0.07512 -0.2003 -0.1377  8 -125.846 267.7  2.38  0.077
# 39 0.12550           0.01554 -0.02882                  -0.1369  5 -129.039 268.1  2.77  0.063
# 32 0.09808  0.066770 0.03469 -0.05856 -0.08673 -0.1937          7 -127.404 268.8  3.50  0.044
# 7  0.11070           0.02107 -0.04812                           4 -130.498 269.0  3.69  0.040
# 40 0.12590  0.008263 0.01585 -0.02832                  -0.1374  6 -129.035 270.1  4.76  0.023
# 48 0.11880  0.035720 0.01253 -0.01789 -0.14030         -0.1312  7 -128.040 270.1  4.77  0.023
# 16 0.10380  0.027010 0.01719 -0.03621 -0.14930                  6 -129.399 270.8  5.49  0.016
# 8  0.11070 -0.002730 0.02096 -0.04826                           5 -130.498 271.0  5.69  0.015
# Models ranked by AIC(x)  
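
The point about scale() can be checked with a small base-R comparison. This is only a sketch on the same simulated data as above; the object names raw and scaled are made up for the illustration, and the scaled formula follows the product-term style used in the question.

```r
set.seed(1)
n <- 100
df <- data.frame(R1 = rnorm(n), X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n))

# Same model, once on the raw predictors and once on scaled ones
raw    <- glm(R1 ~ X1 + X2 + X1:X2, data = df)
scaled <- glm(R1 ~ scale(X1) + scale(X2) + scale(X1 * X2), data = df)

# The coefficients differ, because the predictors are on different scales ...
coef(raw)
coef(scaled)

# ... but the fit is the same, so the AIC used for ranking does not change
all.equal(AIC(raw), AIC(scaled))
all.equal(deviance(raw), deviance(scaled))
```

Both models span the same column space (intercept, two main effects, one product), which is why the ranking criterion is unaffected.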

Note that you don't actually have a 3-way interaction in your formula. That term would be X1:X2:X3, and it would be included by the formula R1 ~ (X1 + X2 + X3)^3.
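
The difference between the ^2 and ^3 formulas can be inspected without fitting anything, using base R's terms(). The variable names two_way and three_way are just for this illustration.

```r
# Term labels generated by each formula (base R only, no data needed)
two_way   <- attr(terms(R1 ~ (X1 + X2 + X3)^2), "term.labels")
three_way <- attr(terms(R1 ~ (X1 + X2 + X3)^3), "term.labels")

two_way
# "X1" "X2" "X3" "X1:X2" "X1:X3" "X2:X3"

# The only term that ^3 adds on top of ^2
setdiff(three_way, two_way)
# "X1:X2:X3"
```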

Finally, if the use of scale is important to you, and you do not have an issue with computation time, you could always filter your resulting M2 table following your subset criteria afterwards.
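
Filtering afterwards could look roughly like the sketch below. It is an illustration only and makes several assumptions: it does not call dredge() at all, but builds a table with a similar layout in base R (one row per model, NA where a term is absent), using scaled product terms as in the question. The helper obeys_marginality() and the parents list are hypothetical names invented for this example. With MuMIn, the same filter could presumably be applied to as.data.frame(M2), provided the column names match the term labels.

```r
set.seed(1)
n <- 100
df <- data.frame(R1 = rnorm(n), X1 = rnorm(n), X2 = rnorm(n), X3 = rnorm(n))

terms <- c("scale(X1)", "scale(X2)", "scale(X3)",
           "scale(X1 * X2)", "scale(X1 * X3)", "scale(X2 * X3)")

# Every on/off combination of the six terms (2^6 = 64 candidate models)
grid <- expand.grid(rep(list(c(FALSE, TRUE)), length(terms)))
names(grid) <- terms

# Fit one candidate; return its coefficients (NA = term absent) and its AIC
fit_one <- function(on) {
  rhs <- if (any(on)) terms[on] else "1"
  fit <- glm(reformulate(rhs, response = "R1"), data = df)
  est <- setNames(rep(NA_real_, length(terms)), terms)
  est[terms[on]] <- coef(fit)[terms[on]]
  c(est, AIC = AIC(fit))
}
tab <- as.data.frame(t(apply(grid, 1, fit_one)))
names(tab) <- c(terms, "AIC")

# Assumed rule set: each product term -> the main effects it requires
parents <- list("scale(X1 * X2)" = c("scale(X1)", "scale(X2)"),
                "scale(X1 * X3)" = c("scale(X1)", "scale(X3)"),
                "scale(X2 * X3)" = c("scale(X2)", "scale(X3)"))

# TRUE for rows where every product term present has both of its parents
obeys_marginality <- function(tab, parents) {
  ok <- rep(TRUE, nrow(tab))
  for (int in names(parents)) {
    has_int     <- !is.na(tab[[int]])
    has_parents <- rowSums(is.na(tab[, parents[[int]], drop = FALSE])) == 0
    ok <- ok & (!has_int | has_parents)
  }
  ok
}

# Keep the admissible models, then re-rank them by AIC
kept <- tab[obeys_marginality(tab, parents), ]
kept <- kept[order(kept$AIC), ]
kept$delta <- kept$AIC - min(kept$AIC)
nrow(kept)  # 18 of the 64 candidates remain
```

The cost of this approach is that all 64 candidates are fitted before any are discarded, which is the computation-time trade-off mentioned above.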

Marc in the box