
As far as I can tell, I have specified this simple GLM the same way using a basic glm function and the caret train function. However, the caret version will not converge. Is there something missing in how I am specifying the train model?


library(caret)
library(terra)  # provides rast() and extract()

logo <- rast(system.file("ex/logo.tif", package="terra"))   
names(logo) <- c("red", "green", "blue")
p <- matrix(c(48, 48, 48, 53, 50, 46, 54, 70, 84, 85, 74, 84, 95, 85, 
              66, 42, 26, 4, 19, 17, 7, 14, 26, 29, 39, 45, 51, 56, 46, 38, 31, 
              22, 34, 60, 70, 73, 63, 46, 43, 28), ncol=2)

a <- matrix(c(22, 33, 64, 85, 92, 94, 59, 27, 30, 64, 60, 33, 31, 9,
              99, 67, 15, 5, 4, 30, 8, 37, 42, 27, 19, 69, 60, 73, 3, 5, 21,
              37, 52, 70, 74, 9, 13, 4, 17, 47), ncol=2)





xy <- rbind(cbind(1, p), cbind(0, a))

# extract predictor values for points
e <- terra::extract(logo, xy[,2:3])

# combine with response (excluding the ID column)
v <- data.frame(cbind(pa=xy[,1], e))
v$pa <- as.factor(v$pa)

#GLM model

model <- glm(formula=as.numeric(pa)~ red + blue + green , data=v)

#Train model

model2 <- train(pa ~ red + green + blue, 
               data=v,
               method = "glm")

>Warning messages:
> 1: glm.fit: algorithm did not converge
> 2: glm.fit: fitted probabilities numerically 0 or 1 occurred


  • `glm()` from base R assumes a Gaussian response by default (which makes it an inefficient version of `lm()`). Does `method = "glm"` use `family = "binomial"` by default? It sure looks like it, because that's the only place those warning messages could come from ... – Ben Bolker Jan 25 '23 at 16:37
  • You can post that as an answer if you'd like. – Ben Bolker Jan 25 '23 at 21:03

1 Answer


Courtesy of @Ben Bolker in the comments:

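The two models were not actually specified the same way. Because the response `pa` is a two-level factor (0 and 1), train treats the problem as classification and fits the GLM with family="binomial" by default, whereas a plain glm call defaults to family="gaussian". The convergence warnings come from the binomial fit. The train model works with family="gaussian".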
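
To make the comparison concrete, here is a minimal sketch of fitting both families in base R and in caret so that the coefficients line up. It uses simulated stand-in data (the column names `pa_num` and `pa_fac` and the simulated values are assumptions, not the raster values from the question), and it relies on my understanding that caret's "glm" method uses gaussian for a numeric outcome and binomial for a factor outcome when no family is supplied.

```r
library(caret)

set.seed(42)

# Hypothetical stand-in data; NOT the logo raster values from the question.
n <- 200
d <- data.frame(red   = runif(n, 0, 255),
                green = runif(n, 0, 255),
                blue  = runif(n, 0, 255))
d$pa_num <- rbinom(n, 1, plogis(-1 + 0.01 * d$red - 0.005 * d$blue))
d$pa_fac <- factor(d$pa_num)

# Fit once on the full data, skipping resampling.
ctrl <- trainControl(method = "none")

# Numeric outcome: caret treats this as regression (gaussian family),
# which is what glm() does when no family is given.
gauss_glm   <- glm(pa_num ~ red + green + blue, data = d)
gauss_caret <- train(pa_num ~ red + green + blue, data = d,
                     method = "glm", trControl = ctrl)

# Factor outcome: caret treats this as classification (binomial family),
# so the base R equivalent needs family = binomial spelled out.
binom_glm   <- glm(pa_fac ~ red + green + blue, data = d, family = binomial)
binom_caret <- train(pa_fac ~ red + green + blue, data = d,
                     method = "glm", trControl = ctrl)

# Compare coefficients of the base R fit and caret's final model.
gauss_match <- isTRUE(all.equal(coef(gauss_glm), coef(gauss_caret$finalModel)))
binom_match <- isTRUE(all.equal(coef(binom_glm), coef(binom_caret$finalModel)))
print(c(gaussian = gauss_match, binomial = binom_match))
```

With a numeric 0/1 outcome caret may warn that the outcome has only two values and ask whether classification was intended; that warning is about the modelling choice, not about convergence.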
