1

I am working on an exercise asking me "Plot the residuals against Y_hat, each predictor variable, and each two-factor interaction term on separate graphs." Here is a snippet of the data set I am using:

> dput(head(Commercial_Properties, 10))
structure(list(Rental_Rates = c(13.5, 12, 10.5, 15, 14, 10.5, 
14, 16.5, 17.5, 16.5), Age = c(1, 14, 16, 4, 11, 15, 2, 1, 1, 
8), Op_Expense_Tax = c(5.02, 8.19, 3, 10.7, 8.97, 9.45, 8, 6.62, 
6.2, 11.78), Vacancy_Rate = c(0.14, 0.27, 0, 0.05, 0.07, 0.24, 
0.19, 0.6, 0, 0.03), Total_Sq_Ft = c(123000, 104079, 39998, 57112, 
60000, 101385, 31300, 248172, 215000, 251015), residuals = c(`1` = -1.03567244005944, 
`2` = -1.51380641405037, `3` = -0.591053402133659, `4` = -0.133568082335235, 
`5` = 0.313283765150399, `6` = -3.18718522392237, `7` = -0.538356748944345, 
`8` = 0.236302385996349, `9` = 1.98922037248654, `10` = 0.105829602747806
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))

From here I created the proper linear model that includes two factor interaction terms:

commercial_properties_lm_two_degree_interaction <- 
  lm(data=Commercial_Properties, 
     formula=Rental_Rates ~ (Age + Op_Expense_Tax + Vacancy_Rate + Total_Sq_Ft)^2)

Next what I was hoping to accomplish was to plot the residuals not just of the linear terms, but also of the interaction terms. I attempted to do this using the residualPlots() function in the car package

library(car)
residualPlots(model=commercial_properties_lm_two_degree_interaction, 
              terms=~ (Age + Op_Expense_Tax + Vacancy_Rate + Total_Sq_Ft)^2)

When applied in this way the output only produced the residual plots against the linear terms, it didn't plot any interactions. So I then attempted to do it manually, but I got an error:

residualPlots(model=commercial_properties_lm_two_degree_interaction,
              terms=~ Age + Op_Expense_Tax + Vacancy_Rate + Tota_Sq_Ft + 
                Age:Op_Expense_Tax + Age:Vacancy_Rate)

Error in termsToMf(model, terms) : argument 'terms' not interpretable.

Now if I were to do things completely manually I was able to get an interaction plot for example:

with(data=Commercial_Properties, plot(x=Op_Expense_Tax * Vacancy_Rate, y=residuals))

plotted successfully. My issue is that sure I can do this completely manually for a reasonably small amount of variables, but it will get extremely tedious once the amount of variables begins to get larger.

So my question is if there is a way to use an already created function in R to make residual plots of the interaction terms or would I be left to doing it completely manually or most likely having to write some sort of loop ?

Note, I'm not asking about partial residuals. I haven't gotten to that point in my text I'm using. Just plain interaction terms against residuals.

jay.sf
  • 60,139
  • 8
  • 53
  • 110
D.C. the III
  • 340
  • 3
  • 14

1 Answers1

1

You could do an eval(parse()) approach using the 'term.labels' attribute.

With gsub(':', '*', a[grep(':', a)]) pull out the interaction terms and replace : with * so it can be evaluated.

a <- attr(terms(commercial_properties_lm_two_degree_interaction), 'term.labels')

op <- par(mfrow=c(2, 3))
with(Commercial_Properties, 
     lapply(gsub(':', '*', a[grep(':', a)]), function(x)
            plot(eval(parse(text=x)), residuals, xlab=x)))
par(op)

enter image description here

Edit

This is how we would do this with a for loop in R (but see comments below):

as <- gsub(':', '*', a[grep(':', a)])
op <- par(mfrow=c(2, 3))
for (x in as) {
  with(Commercial_Properties, 
       plot(eval(parse(text=x)), residuals, xlab=x)
  )
}
par(op)
jay.sf
  • 60,139
  • 8
  • 53
  • 110
  • I've never used the `gsub` function before. I'm going to take a look at it. Since I am doing this to learn more techniques. Thanks for the suggestion. and I just checked your profile. Thanks for the link to an R style guide. I always have questions of best practices when I'm doing things. – D.C. the III Oct 04 '21 at 19:29
  • 1
    So I went over your piece of code to understand it a bit more since it had some functions I haven't used, but from reading up on will be useful. Is this how the code in `lapply` is working: `gsub()` is finding the elements in my vector `a` that have a `:` replacing them with `*`, these were found through the `grep()` function. Here is where I got a little confused. We specify `function(x)`, specifically we are going to apply the `plot` function. To get the x-axis terms, `eval` is used to get the values for each of the individual set of interactions. So is `lapply` sort of like the `for` loop? – D.C. the III Oct 06 '21 at 23:24
  • 1
    my main question is why do we use the `function(x)` command in `lapply`? I tried it with `FUN = plot(....)` and it failed. I'm not sure of that object's (if that is the right term) behaviour. – D.C. the III Oct 06 '21 at 23:32
  • 1
    @dc3rd I appreciate how closely you look at the code! Yes exactly, `lapply` _is_ the `for` loop, but [implemented in C](https://github.com/wch/r-source/blob/79298c499218846d14500255efd622b5021c10ec/src/main/apply.c), and therefore much faster. I added an edit to show how this would be done in an R `for` loop. The `function(x)` precedes the body of a so-called _anonymous function_ that we define afterwards; without it, `plot` won't know what `x` is and which is why you probably failed. See also `help("function")`. – jay.sf Oct 07 '21 at 04:47