
I am experiencing the same error as raised and answered here (cv.glm variable lengths differ) and in various other threads. Despite using the "correct" formula structure as suggested in all these threads, the error persists:

mod <- glm(Y ~ Var1, data = df, family = binomial)

cv.glm(df, mod, K = 8)

Error in model.frame.default(formula = Y ~ Var1, data = list( : variable lengths differ (found for 'Var1')

Are there any other known sources of this issue?

user303287
  • Can you post an example that reproduces your error, or `dput` your data? Also edit your question to include which `library` you use. With simulated data, your code works properly – Elia May 12 '21 at 15:03
  • Thanks. Actually, I think I just found the answer whilst preparing my dataset for this post. I will post it below – user303287 May 12 '21 at 16:30

1 Answer


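My response variable was actually defined outside the data frame, as

Y <- cbind(df$Y2, df$Y1 - df$Y2)

and so whilst the model formula looked like it was in the correct format, the way my response variable was created posed an issue. Y is a separate object rather than a column of df, so when cv.glm refits the model on a subset of df, Y keeps its full length while Var1 is subsetted, and the lengths no longer match.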

If I use the alternative of:

mod <- glm(Y2/Y1 ~ Var1, family = binomial, data = df, weights = Y1)

then running boot::cv.glm(df, mod, K = 8) works.
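For anyone who wants to reproduce this, here is a minimal sketch on simulated data. The data are made up (the sample size, the 20 trials per row and the true coefficient are all assumptions for illustration); only the names Y, Y1, Y2 and Var1 come from the post, with Y1 taken to be the number of trials and Y2 the number of successes. It should show the workspace-level Y failing inside cv.glm and the weighted-proportion formula running cleanly.

```r
library(boot)

set.seed(1)
n  <- 80
# Simulated data (assumption): Y1 = number of trials, Y2 = number of successes
df <- data.frame(Var1 = rnorm(n), Y1 = rep(20L, n))
df$Y2 <- rbinom(n, size = df$Y1, prob = plogis(0.5 * df$Var1))

# Failing pattern: Y is an object in the workspace, not a column of df
Y <- cbind(df$Y2, df$Y1 - df$Y2)
mod_bad <- glm(Y ~ Var1, data = df, family = binomial)

# cv.glm refits on subsets of df; Y stays full length, so the refit errors.
# The error message is captured as a string instead of stopping the script.
res_bad <- tryCatch(
  cv.glm(df, mod_bad, K = 8),
  error = function(e) conditionMessage(e)
)
print(res_bad)

# Working pattern: every variable in the formula is a column of df
mod_ok <- glm(Y2 / Y1 ~ Var1, family = binomial, data = df, weights = Y1)
cv_ok  <- cv.glm(df, mod_ok, K = 8)
print(cv_ok$delta)
```

The general point is that everything the formula refers to (response, predictors, weights) has to live in the data frame passed to cv.glm, because that data frame is the only thing that gets subsetted for each fold.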

user303287