I am trying to compare the two models below:

H1 <- lm(y ~ x1 + x2, data = df) 
H2 <- lm(y ~ x1 + x2 + x3, data = df)

anova(H1, H2)

However, I get an error message:

Error: Argument 'data' must be a data frame

When I pass the data explicitly, I get a different error message:

anova(H1, H2, data = df)

Error in .subset2(x, i) : recursive indexing failed at level 2

I inspected the models (I am not sure I am looking at the right thing), and they show:

H1
model   list[89 x 3] (S3: data.frame) A data.frame with 89 rows and 3 columns
y       double[89]                   3.00   3.50  4.25 5.11  1.00 ...
x1       double[89]                   19    24   31    35   20   21 ...
x2       double[89]                   1 1 1 1 2 1 1 ...

str(H1)
List of 12
 $ coefficients : Named num [1:3] 5.42739 0.000294 -0.950346
  ..- attr(*, "names")= chr [1:3] "(Intercept)" "x1" "x2"
 $ residuals    : Named num [1:89] -1.4844 -0.9835 -0.2326 -2.5338 0.0177 ...
  ..- attr(*, "names")= chr [1:89] "1" "2" "3" "4" ...
 $ effects      : Named num [1:89] -40.783 0.796 -3.258 -2.349 0.068 ...
  ..- attr(*, "names")= chr [1:89] "(Intercept)" "x1" "x2" "" ...
 $ rank         : int 3
 $ fitted.values: Named num [1:89] 4.48 4.48 4.48 3.53 4.48 ...
  ..- attr(*, "names")= chr [1:89] "1" "2" "3" "4" ...
 $ assign       : int [1:3] 0 1 2
 $ qr           :List of 5
  ..$ qr   : num [1:89, 1:3] -9.434 0.106 0.106 0.106 0.106 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:89] "1" "2" "3" "4" ...
  .. .. ..$ : chr [1:3] "(Intercept)" "x1" "x2"
  .. ..- attr(*, "assign")= int [1:3] 0 1 2
  ..$ qraux: num [1:3] 1.11 1.02 1.03
  ..$ pivot: int [1:3] 1 2 3
  ..$ tol  : num 1e-07
  ..$ rank : int 3
  ..- attr(*, "class")= chr "qr"
 $ df.residual  : int 86
 $ xlevels      : Named list()
 $ call         : language lm(formula = y ~ x1 + x2, data = df)
 $ terms        :Classes 'terms', 'formula'  language y ~ x1 + x2
  .. ..- attr(*, "variables")= language list(y, x1, x2)
  .. ..- attr(*, "factors")= int [1:3, 1:2] 0 1 0 0 0 1
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:3] "y" "x1" "x2"
  .. .. .. ..$ : chr [1:2] "x1" "x2"
  .. ..- attr(*, "term.labels")= chr [1:2] "x1" "x2"
  .. ..- attr(*, "order")= int [1:2] 1 1
  .. ..- attr(*, "intercept")= int 1
  .. ..- attr(*, "response")= int 1
  .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. ..- attr(*, "predvars")= language list(y, x1, x2)
  .. ..- attr(*, "dataClasses")= Named chr [1:3] "numeric" "numeric" "numeric"
  .. .. ..- attr(*, "names")= chr [1:3] "y" "x1" "x2"
 $ model        :'data.frame':  89 obs. of  3 variables:
  ..$ y : num [1:89] 3 3.5 4.25 1 4.5 5.25 4.75 3.75 3.5 5 ...
  ..$ x1: num [1:89] 25 22 19 24 18 24 18 18 21 19 ...
  ..$ x2: num [1:89] 1 1 1 2 1 1 1 1 1 1 ...
  ..- attr(*, "terms")=Classes 'terms', 'formula'  language y ~ x1 + x2
  .. .. ..- attr(*, "variables")= language list(y, x1, x2)
  .. .. ..- attr(*, "factors")= int [1:3, 1:2] 0 1 0 0 0 1
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : chr [1:3] "y" "x1" "x2"
  .. .. .. .. ..$ : chr [1:2] "x1" "x2"
  .. .. ..- attr(*, "term.labels")= chr [1:2] "x1" "x2"
  .. .. ..- attr(*, "order")= int [1:2] 1 1
  .. .. ..- attr(*, "intercept")= int 1
  .. .. ..- attr(*, "response")= int 1
  .. .. ..- attr(*, ".Environment")=<environment: R_GlobalEnv> 
  .. .. ..- attr(*, "predvars")= language list(y, x1, x2)
  .. .. ..- attr(*, "dataClasses")= Named chr [1:3] "numeric" "numeric" "numeric"
  .. .. .. ..- attr(*, "names")= chr [1:3] "y" "x1" "x2"
 - attr(*, "class")= chr "lm"


H2
model   list[89 x 4] (S3: data.frame) A data.frame with 89 rows and 4 columns
y       double[89]                   3.00   3.50  4.25 5.11  1.00 ...
x1       double[89]                   19    24   31    35   20   21 ...
x2       double[89]                   1 1 1 1 2 1 1 ...
x3       double[89]                   0 0 0 0 1 0 0 



Both models have `xlevels` as list[0]. Let me know if you need more information.

I would really appreciate it if you could help me out with this!

user240313
  • Try without `data` in anova, just: `anova(H1, H2)` – pogibas May 23 '19 at 09:53
  • Yup I first started with that, but that didn't work, so I defined the data with `data =` Thanks though – user240313 May 23 '19 at 09:59
  • Please check https://grokbase.com/t/r/r-help/106tteq1ab/r-recursive-indexing-failed-at-level-2 – pogibas May 23 '19 at 10:00
  • Thanks for the comment; I do get what the post in the link is about but I cannot seem to apply that to my problem. All the x's and y are columns of the data.frame & are as vectors - what should I do with these exactly..? – user240313 May 23 '19 at 10:21

1 Answer

`anova()` uses the data stored in the fitted `lm` objects, so it needs no `data` argument of its own. Are you sure the models were fitted correctly?

The `data` argument in the `lm()` call should be a data frame with columns `x1`, `x2` and `y`.

If `x1`, `x2` and `y` are standalone vectors, omit the `data` argument.
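
To make the two cases concrete, here is a minimal sketch. The data are simulated (only the column names and the 89 rows mirror the question; the values are made up), so it shows the intended call pattern rather than a reproduction of the asker's error.

```r
# Simulated stand-in for the asker's data frame; the values are invented.
set.seed(1)
n  <- 89
df <- data.frame(
  x1 = rnorm(n, mean = 22, sd = 4),
  x2 = sample(1:2, n, replace = TRUE),
  x3 = rbinom(n, 1, 0.2)
)
df$y <- 5 - 0.9 * df$x2 + rnorm(n)

# Case 1: the variables are columns of a data frame, so pass it via `data`.
H1 <- lm(y ~ x1 + x2, data = df)
H2 <- lm(y ~ x1 + x2 + x3, data = df)

# anova() is given only the fitted models, with no `data` argument.
cmp <- anova(H1, H2)
print(cmp)

# Case 2: the variables are free-standing vectors, so `data` is omitted.
y  <- df$y
x1 <- df$x1
x2 <- df$x2
H1_vec <- lm(y ~ x1 + x2)
```

Both ways of fitting `H1` should give the same coefficients; the difference is only where `lm()` looks up the variables.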

Jorge Mendes
  • Thanks for the comment! Yes x1, x2, and y are all columns of the data, df, but I do see that when I do `is.vector(df$x1)` then I get TRUE. Do I then need to change each of these into a data.frame using as.data.frame() function? – user240313 May 22 '19 at 16:32
  • Can you run is.data.frame(df) to check whether df is really a data.frame? Or run df <- data.frame(df) to make sure it's in the right format. – Jorge Mendes May 22 '19 at 16:37
  • Yes I just checked, and df is a data.frame – user240313 May 22 '19 at 16:40
  • Also did df <- data.frame(df) but still doesn't work unfortunately! – user240313 May 22 '19 at 17:49
  • Try as.data.frame instead. But better yet post the output of `str(df)` before you tried to do any conversion to data frames. – Dason May 23 '19 at 12:20
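
The checks suggested in the comments above could be run together along these lines. This is only a sketch: the small `df` is a hypothetical stand-in for the real data, and the helper names `is_df`, `col_cls` and `odd_cols` are made up for illustration. Note that `is.vector(df$x1)` returning TRUE is expected for an ordinary numeric column, so that alone does not indicate a problem.

```r
# Hypothetical stand-in for the asker's data; replace with the real `df`.
df <- data.frame(
  y  = c(3, 3.5, 4.25),
  x1 = c(25, 22, 19),
  x2 = c(1, 1, 2)
)

# Overall structure, one line per column.
str(df)

# TRUE when df is a genuine data frame.
is_df <- is.data.frame(df)

# First class of each column, e.g. "numeric" or "factor".
col_cls <- sapply(df, function(col) class(col)[1])

# Names of columns that are themselves lists or matrices; an assumption here
# is that such nested columns are worth ruling out before fitting.
odd_cols <- names(df)[sapply(df, function(col) is.list(col) || is.matrix(col))]

print(is_df)
print(col_cls)
print(odd_cols)
```

Posting the `str(df)` output alongside these results would show whether every column is a plain vector of the expected type.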