1

I try to use MuMIn::dredge() on a global model to give me my candidate models, given certain criteria. I've read ?dredge and understood some of it, but I still have some question on how to include one of my criteria:

If I have a global model with e.g.

y ~ X1 + X2 + X3 + X4 + X5 + X6 + X7 + X1:X2 + X2:X3 + X3:X4 + X4:X6 + X5:X7

(several main effects and several interaction) and I want to specify that I only want dredge to return models which include one interaction at a time, how do I subset this in an easy way?

Also, if the global model also includes a second degree polynomial of a parameter

Y ~ X1 + X1^2 + X2 + X3 + X4

and I want to specify that these two should always exist together in the models (main effects X1 never alone without X1^2) I understood the syntax for this is (agree?):

dredge(global.model, subset=(X1^2|!X1))

And if I have understood it correctly, dredge() is taking care of the other way around (the X1^2 will only occur in the model if X1 is in the model - same for interactions which will never occur without the main effects present)?

But how is the syntax for second degree polynomials inside dredge()? Am I right that it's something like this:

dredge(global.model, subset=({I(X1^2)}|!X1))

?

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
Mary
  • 11
  • 3
  • Update: I guess that there is a possibility regarding the first question, doing a long row of subset=!(X1:X2&&X2:x3) && !(X1:X2&&X3:X4) && and so on - but does it exist a shortcut for this kind of subsetting in dredge which makes it more easy to limit the number of interactions present at the same time? – Mary Feb 28 '18 at 13:01

1 Answers1

1

Not a most elegant solution, but it works:

library(MuMIn)

# example global model with many interactions:
fm <- lm(y ~ (X1 + X2 + X3 + X4)^2, Cement, na.action = na.fail)

# create a vector of interaction term names:
x <- c(getAllTerms(fm))
x <- x[grep(":", x)] # won't work if any variable name has ":" in it.

# create a subset expression (sum of interactions < N):
ss <- substitute(sum(X) < N, list(N = 3, X = as.call(lapply(c("c", x), as.symbol))))

# the resulting expression is:
sum(c(`X1:X2`, `X1:X3`, `X1:X4`, `X2:X3`, `X2:X4`, `X3:X4`)) < 3

dd <- dredge(fm, subset = ss)

# verify:
max(rowSums(!is.na(dd[, x]))) # --> 2

Edit: better interaction detection, and wrapped into function:

subsetExprInteractionLimit <- function(model, N = 1) {
    x <- getAllTerms(model)
    x <- c(x)[attr(x, "order")][attr(terms(model), "order") > 1]
    substitute(sum(X) <= N, list(N = N, X = as.call(lapply(c("c", x), as.symbol))))
}

subsetExprInteractionLimit(fm, N = 1) # limit to 1 interaction
Kamil Bartoń
  • 1,482
  • 9
  • 10
  • Thanks! As you say, not the most elegant solution but this worked in my dataset as well and was way more elegant than the solution I came up with on my own. – Mary Mar 01 '18 at 08:28
  • On second thought, this did not work when trying to implement other criterias in subset at the same time. E.g. if i want X1 and X2 to always be present together, never on their own - at the same time as I only want one interaction present. – Mary Mar 07 '18 at 14:31
  • You could use something like `substitute({other_criteria} & (XXX), list(XXX= subsetExprInteractionLimit(...)))`. – Kamil Bartoń Mar 09 '18 at 15:25