
I am trying to get a prediction curve for a logistic glm in R. I create a new data frame but do not understand why I keep getting this error. Thank you for your help!

glm1 <- glm(dataset_presence$presence ~ dataset_presence$Int_Bk, data = dataset_presence, family = binomial, na.action = na.exclude)

newdat <- data.frame(Int_Bk = seq(min(dataset_presence$Int_Bk), max(dataset_presence$Int_Bk), length=50))

newdat$presence <- predict(glm1, newdata = newdat, type="response")

Error in $<-.data.frame(*tmp*, presence, value = c(0.862135653229272, :
  replacement has 167 rows, data has 50
In addition: Warning message:
'newdata' had 50 rows but variables found have 167 rows

fabla

1 Answer


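You need to write the formula with bare column names rather than data$variable. With dataset_presence$Int_Bk in the formula, predict() cannot find a predictor by that name in newdat, so it falls back to the original 167-row column, which is what the warning reports: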

glm1 <- glm(presence ~ Int_Bk, data = dataset_presence, family = binomial, na.action = na.exclude)

Once you do that, your prediction will work. That said, if you're trying to plot the curve, the ggeffects and effects packages have some really useful functions. For example, using the Chile data from the carData package:

library(ggeffects)
library(dplyr)  # provides %>% and filter()
data("Chile", package="carData")

Chile <- Chile %>% 
  filter(vote %in% c("Y", "N"))
m2 <- glm(vote ~ age, data=Chile, family=binomial)

p <- ggpredict(m2, terms="age")
plot(p)

The plot() method for the ggpredict object produces a ggplot.
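
If you would rather stay in base R, here is a minimal sketch of the corrected workflow from the question. The data are simulated as a stand-in for dataset_presence (the 167 rows, the value range, and the coefficients are assumptions made only so the example runs); the model and prediction calls are the ones discussed above.

```r
# Simulated stand-in for dataset_presence (hypothetical data, 167 rows)
set.seed(1)
dataset_presence <- data.frame(Int_Bk = runif(167, 0, 10))
dataset_presence$presence <- rbinom(167, 1,
                                    plogis(-2 + 0.5 * dataset_presence$Int_Bk))

# Bare column names in the formula; the data argument supplies the columns
glm1 <- glm(presence ~ Int_Bk, data = dataset_presence, family = binomial,
            na.action = na.exclude)

# Grid of 50 values in a column named exactly like the predictor.
# na.rm = TRUE guards against missing values in the real data.
newdat <- data.frame(Int_Bk = seq(min(dataset_presence$Int_Bk, na.rm = TRUE),
                                  max(dataset_presence$Int_Bk, na.rm = TRUE),
                                  length.out = 50))
newdat$presence <- predict(glm1, newdata = newdat, type = "response")

# Observed 0/1 values with the fitted probability curve drawn on top
plot(presence ~ Int_Bk, data = dataset_presence)
lines(presence ~ Int_Bk, data = newdat)
```

Because newdat has a column called Int_Bk, predict() returns one fitted probability per grid row (50 here), so the assignment no longer fails.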


DaveArmstrong