
I found other questions regarding this topic, such as this, but I keep getting the error message

Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ

Below is the code I am using:

library(DAAG)
attach(ultrasonic)

g.poly = lm(UR ~ poly(MD, 3), data = ultrasonic)
cv.poly <- cv.lm(ultrasonic, g.poly ,m=3, plotit=TRUE, printit=TRUE, dots=FALSE, seed=29) 

Of course, the lengths are the same:

> length(UR)
[1] 214
> length(MD)
[1] 214

Note that in the same script, I run another linear regression with cross-validation, and that one works.

library(DAAG)
g.lin = lm(log(UR) ~ MD, data = ultrasonic)
cv.lin <- cv.lm(ultrasonic, g.lin ,m=3, plotit=TRUE, printit=TRUE, dots=FALSE, seed=29)

Any idea why the cross-validation of the polynomial regression does not work?

EDIT

To get the data:

install.packages('nlsmsn')
library('nlsmsn')
data(Ultrasonic)

# Names differ: I am using a local copy named ultrasonic (lower-case u) with different column names, but the data are identical.
#UR = y
#MD = x
HonzaB
  • First, do not `attach` the data.frame. Then, your error results from plotting. Do you get the same error if you switch off plotting in `cv.lm`? Finally, a reproducible example is needed for further diagnosis. – Roland Dec 02 '16 at 07:46
  • I'd be glad to be wrong, but there seem to be bugs and badly formatted if/else statements in `cv.lm` – Vincent Bonhomme Dec 02 '16 at 07:52
  • `Ultrasonic` not `ultrasonic` btw? – Vincent Bonhomme Dec 02 '16 at 07:54
  • @Roland - The answers I found always recommend attaching the data frame, but the result is the same when it is detached. Also, I still get the error when I switch off the plotting. – HonzaB Dec 02 '16 at 07:55
  • You can trust @Roland and burn all books that defend `attach`ing. – Vincent Bonhomme Dec 02 '16 at 07:57
  • No `UR` or `MD` variables in the `Ultrasonic` dataset ... – Roland Dec 02 '16 at 07:58
  • @ZheyuanLi : tried to extract the function and run it manually... quite a nightmare yet I still hope _I_'m the problem – Vincent Bonhomme Dec 02 '16 at 07:59
  • @roland, sorry, names differ. See the update. – HonzaB Dec 02 '16 at 08:00
  • @Roland `g.poly = lm(y ~ poly(x, 3), data = ultrasonic); cv.poly <- cv.lm(ultrasonic, g.poly ,m=3, plotit=TRUE, printit=TRUE, dots=FALSE, seed=29)` doesn't work either anyway ;-) ! – Vincent Bonhomme Dec 02 '16 at 08:01
  • @HonzaB are you trying to get more than residuals and/or some other measure of how well your points are fitted? In other words, do you have the feeling that `predict` is not enough? – Vincent Bonhomme Dec 02 '16 at 08:02
  • @VincentBonhomme well, all I need is to fit two or three regression models and somehow compare their performance, so I was thinking CV could be good here. – HonzaB Dec 02 '16 at 08:06

1 Answer


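DAAG:::cv.lm obviously does not support everything you can do with lm, e.g., it does not support functions in the formula. You need to take an intermediate step: build the model matrix yourself, so the polynomial terms become ordinary columns, and fit lm on those columns.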

#expand the cubic polynomial term into plain numeric columns
mf <- as.data.frame(model.matrix(y ~ poly(x, 3), data = Ultrasonic))
mf$y <- Ultrasonic$y
mf$`(Intercept)` <- NULL

#sanitize names
names(mf) <- make.names(names(mf))
#[1] "poly.x..3.1" "poly.x..3.2" "poly.x..3.3" "y"
g.poly.san <- lm(y ~ ., data = mf)

cv.poly <- cv.lm(mf, g.poly.san, m=3, plotit=TRUE, printit=TRUE, dots=FALSE, seed=29)
#works
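
If the goal is only to compare a few models, as mentioned in the comments, a hand-rolled k-fold cross-validation in base R avoids the formula restriction altogether, because `lm` and `predict` handle `poly()` themselves. The following is a minimal sketch under stated assumptions, not what `cv.lm` does internally: `cv_mse` is a hypothetical helper, and the data are simulated stand-ins with columns `x` and `y` rather than the real Ultrasonic data.

```r
# Simulated stand-in data (assumption): 214 rows, columns x and y.
set.seed(29)
dat <- data.frame(x = runif(214, 0.5, 6))
dat$y <- 80 * exp(-0.6 * dat$x) + rnorm(214, sd = 3)

# Hypothetical helper: k-fold cross-validated mean squared prediction error.
# `response` names the column holding the observed values being predicted.
cv_mse <- function(formula, data, k = 3, response = "y") {
  # Assign every row to one of k folds at random.
  folds <- sample(rep(seq_len(k), length.out = nrow(data)))
  sq_err <- numeric(nrow(data))
  for (i in seq_len(k)) {
    test <- folds == i
    # Fit on the other folds, predict the held-out fold.
    fit <- lm(formula, data = data[!test, ])
    pred <- predict(fit, newdata = data[test, ])
    sq_err[test] <- (data[[response]][test] - pred)^2
  }
  mean(sq_err)
}

# Reset the seed before each call so both models see the same folds.
set.seed(29)
mse_poly <- cv_mse(y ~ poly(x, 3), dat)
set.seed(29)
mse_lin <- cv_mse(y ~ x, dat)
c(cubic = mse_poly, linear = mse_lin)
```

The errors are only comparable when both models predict the response on the same scale, so a model fitted to `log(y)` would need its predictions transformed back before comparing it with a model fitted to `y`.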
Roland