2
# Set data sets
data <- MASS::birthwt
X <- as.matrix(cbind(1, data[, -1]))
Y <- data[, 1]
n <- dim(X)[1]
p <- dim(X)[2]    
beta <- rep(0, p)  # Initialize beta
beta1 <- rep(1, p)  # Initialize beta1


# fit logit regression using Fisher scoring
while (sum(abs(beta - beta1))  > 0.01) {
  beta1 <-beta

  # Calculate mu and eta
  eta <- X %*% beta1
  mu <- exp(eta) / (1 + exp(eta))

  # Calculate the derivatives
  dmu_deta <- exp(eta) / (( 1 + exp(eta)) ^ 2)
  deta_dmu <- 1 / mu + 1 / (1 - mu)

  V <- mu * (1 - mu)  # Calculate the variance function
  W <- diag((dmu_deta ^ 2 / V)[,])  # Calculate the weight matrix
  Z <- eta + (Y - mu) * deta_dmu  # Calculate Z

  beta <- solve(t(X) %*% W %*% X) %*% (t(X) %*% W %*% Z)
}

I am using Iteratively Reweighted Least Square method. The X and Y come from the built-in dataset birthwt. I do not understand why this method does not converge. It always returns a NaN. But when I remove the intercept, it converges. I know that I can simply use glm, but I would like to understand the implementation.

Sabuncu
  • 5,095
  • 5
  • 55
  • 89
Xingzhe He
  • 21
  • 2
  • 1
    Seems like `glm` don't converge either. – F. Privé Sep 20 '17 at 21:20
  • 1
    Can you provide where you get the formula for the IRLS? I've not `dmu_deta` nor `deta_dmu` in the one I use. Although it doesn't converge either. – F. Privé Sep 20 '17 at 21:24
  • @F.Privé [link](http://www.ciencias.unal.edu.co/unciencias/data-file/user_35/file/Modelos%20Lineales%20Generalizados%20-%20Clarice%20Demetrio.pdf) On the 23rd page. My professor doesn't like to write a powerpoint so I just picked one on the internet. The dmu_deta is the partial derivative mu to eta. I think glm also converges. At least I get some numbers rather than NaN. – Xingzhe He Sep 20 '17 at 23:16
  • 1
    See page 121(140) of [this PDF](https://web.stanford.edu/~hastie/Papers/ESLII.pdf). – F. Privé Sep 21 '17 at 07:10
  • @F.Privé It turns out that my data pre-prosessing has some problems. Thank you anyway! glm did not work actually, but I am curious why R can still produce the answer. – Xingzhe He Sep 22 '17 at 08:15
  • 1
    I think `glm` stops before going nuts (returns the iteration before). And it gives you a warning to say that there was a convergence problem. Can you make an answer giving the reason why this wasn't working? – F. Privé Sep 22 '17 at 10:10
  • @F.Privé I think it is because I treated non-numeric variable as a numeric variable.Just :race <- factor(race, label=c("white", "black", "other")) – Xingzhe He Sep 22 '17 at 13:55

0 Answers0