
I am trying to cross-validate a glm model, but I get an error about the variables in the input. Does anyone know why?

    df.t <- structure(list(hsa_miR_1271_5p = c(5.49810955587331, 2.59625048602785, 
    -1.18451789616878, 5.15323970237091, 0.140211440674119, 3.04813811986249
    ), hsa_miR_1306_5p = c(3.86825008468275, 4.11453141456941, 4.74606690723312, 
    4.07411857512024, 4.21025999335593, 3.34936453244035), hsa_miR_3196 = c(5.34473949644032, 
    -1.11507439046225, -1.18451789616878, 5.15323970237091, -1.08209121203209, 
    0.527829025138608), hsa_miR_4484 = c(-1.18870525729212, 2.0485452441295, 
    -1.18451789616878, -1.36849402655295, 0.140211440674119, 2.62754403295089
    ), hsa_miR_4791 = c(3.08850377258275, 3.34342021798402, 4.74606690723312, 
    -1.36849402655295, 2.39264491482849, 3.55905381780721)), row.names = c("1025", 
    "1101", "1330", "1428", "1473", "175"), class = "data.frame")


    OverallStatus.discovery <- c(0L, 0L, 0L, 0L, 0L, 1L)

    FML <- OverallStatus.discovery ~ hsa_miR_1306_5p + hsa_miR_3196 + hsa_miR_1271_5p + 
        hsa_miR_4484 + hsa_miR_4791

    multifit.discovery <- glm(FML, family = binomial(), data = df.t)

    library(boot)
    cv.mse <- cv.glm(df.t, multifit.discovery)

The last call fails with:

    Error in model.frame.default(formula = FML, data = list(hsa_miR_1271_5p = c(5.49810955587331, : variable lengths differ (found for 'hsa_miR_1306_5p')

cuttlefish44
user2300940

1 Answer


?cv.glm says:

    Usage
    cv.glm(data, glmfit, cost, K)

    Arguments
    data    A matrix or data frame containing the data. The rows should be cases and the columns correspond to variables, one of which is the response.

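Your response, OverallStatus.discovery, is not a column of df.t. cv.glm refits the model on subsets of the rows of data, so the predictors get shorter while the response, which is taken from the workspace, keeps its full length. That is why you get "variable lengths differ". So maybe the code below is what you want: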

    df.t2 <- cbind(df.t, OverallStatus.discovery = OverallStatus.discovery)
    cv.mse <- cv.glm(df.t2, multifit.discovery)
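
To see the mechanism in isolation, here is a minimal sketch on simulated data (your six-row sample has only one positive case, so it is not a good basis for a stable demonstration). The names toy, x1, x2 and y are made up for illustration and are not from your code. It first reproduces the error with the response kept outside the data frame, then runs the same cross-validation after the response is added as a column.

```r
library(boot)

set.seed(1)  # simulated data, for illustration only
n <- 40
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
y <- rbinom(n, size = 1, prob = plogis(0.8 * toy$x1))

# Response lives outside the data frame: glm() itself is happy, but
# cv.glm() refits on row subsets of `toy` while `y` keeps all n values.
fit_bad <- glm(y ~ x1 + x2, family = binomial(), data = toy)
res_bad <- tryCatch(
  cv.glm(toy, fit_bad, K = 5),
  error = function(e) conditionMessage(e)  # capture the message instead of stopping
)
print(res_bad)

# Response stored as a column: every refit sees predictors and response
# subset to the same rows.
toy$y <- y
fit_ok <- glm(y ~ x1 + x2, family = binomial(), data = toy)
cv_ok <- cv.glm(toy, fit_ok, K = 5)

# Raw and adjusted cross-validation estimates of prediction error
print(cv_ok$delta)
```

It is probably cleanest to build the data frame with the response first and fit glm on that same data frame, so the model and the cross-validation see exactly the same data.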
cuttlefish44