Issues related to Gamma Regression

Question

I want to derive coefficients of Gamma regression by iterated reweighted method (manually). When I run this code with out for{} loop it works properly but with loop it produce NaN. My code is:

n<-10
y <- rgamma(n, 10, 0.1)
x1 <- rnorm(n, -1,1)
x2 <- rnorm(n, -1,1)
x3 <- rnorm(n, -1,1)

x<-as.matrix(cbind(1,x1,x2,x3))
reg <-glm(y~x1+x2+x3, family=Gamma(link = "inverse"))

### step 1
W<-G<-matrix(0,ncol=length(y),nrow=length(y))
b<-rep(0,4)

for(i in 1:50) {

### step 2
eta<-x%*%b
mu<-pnorm(eta)
diag(G)<-1/dnorm(eta)
z<-eta + G%*%(y - mu)
diag(W)<-(dnorm(eta)^2)/(mu*(1-mu))

### step 3
b <- solve(t(x)%*%W%*%x)%*%t(x)%*%W%*%z

}

Kindly help. My 2nd question is related to glm(). Is there any way which describe that how many iterations has glm() used? Regards.
Updates
with help of this I update this code but its not working.

library(gnlm)
# custom link / inverse 
inv <- function(eta)  -1/(eta)

n<-10
y <- rgamma(n, 10, 0.1)
x1 <- rnorm(n, -1,1)
x2 <- rnorm(n, -1,1)
x3 <- rnorm(n, -1,1)

x<-as.matrix(cbind(1,x1,x2,x3))
reg <-glm(y~x1+x2+x3, family=Gamma(link = "inverse"))
library(gnlm)
reg1<- gnlr(y=y,
     distribution = "gamma",
     mu = ~ inv(beta0 + beta1*x1 + beta2*x2 + beta3*x3),
     pmu = list(beta0=1, beta1=1, beta2=1, beta3=1),
     pshape=0.1
)

I want to derive reg and reg1 same results. Kindly help.

What is your `for()` loop intended to be doing? It's looping over i = 1 to 50 but then never using the value of i. — Paul Stafford Allen, Jan 10 '23 at 11:04
But every iteration of the loop `b` will be overwritten. If you put `b[i] <-` etc then the results of the first loop iteration will be written to b[1], the second to b[2] etc and can be retrieved afterwards. — Paul Stafford Allen, Jan 10 '23 at 11:19
actually this methods repeat until results become similar. So, I think its better to print all iteration results. — J AK, Jan 10 '23 at 11:22
The error appears to be in the formulas rather than with the code. Do you have a reference for the algorithm you used? — jblood94, Jan 10 '23 at 11:32
Hmm. The matrix multiplication `x%*%b` seems to be producing all `NaN`, regardless of whether or not it's in the loop. Neither x nor b contain nan (tested with `is.nan()` and both are numeric. — Paul Stafford Allen, Jan 10 '23 at 11:35
@jblood94 [link](https://www.youtube.com/watch?v=Cd-B5wFSaxw) — J AK, Jan 10 '23 at 13:28
@jblood94 issues is not probit or gamma, problem is about loop. Why same prog is not working? — J AK, Jan 11 '23 at 06:55
The issue certainly is that probit and gamma regression require a different algorithm. Your code would work if you changed `family = Gamma(link = "inverse")` to `family = family = binomial(link = "probit")` and `y` were compatible with probit regression (taking values of `0` and `1`). — jblood94, Jan 11 '23 at 14:03
It might also be useful to know more about the context/long-term goal: are you building GLM by hand for pedagogical purposes, or because you are aiming at some generalized/more flexible model that GLM won't do? Is `my_glmfit` in https://bbolker.github.io/stat4c03/notes/glm_comp.pdf useful? — Ben Bolker, Jan 12 '23 at 17:55

score 1 · Answer 1 · answered Jan 11 '23 at 13:59

1

For the first code chunk, the algorithm is for probit regression, not gamma. To perform the iterations manually using glm's default of no weights and no offset for family = Gamma(link = "inverse"), update the code as follows.

n <- 10
y <- rgamma(n, 10, 0.1)
x1 <- rnorm(n, -1,1)
x2 <- rnorm(n, -1,1)
x3 <- rnorm(n, -1,1)

x <- as.matrix(cbind("(Intercept)" = 1,x1,x2,x3))
reg <- glm(y~x1+x2+x3, family = Gamma(link = "inverse"))

### step 1
eta <- 1/y

for(i in 1:reg$iter) {
  tX <- t(X <- x/eta)
  b <- drop(solve(tX%*%X)%*%tX%*%(2 - y*eta))
  eta <- drop(x %*% b)
}

reg$iter is the number of iterations performed by the glm function. Check that b is equal to the coefficients given by glm:

all.equal(reg$coefficients, b)
#> [1] TRUE

answered Jan 11 '23 at 13:59

jblood94

10,340
1
10
15

Thankx for your answer. I want to know 1 more thing. If I want to repeat this process for some other **link** which will b changed. Because I didn't understand elements of loop. – J AK Jan 11 '23 at 15:13
2

Try stepping through `glm` and `glm.fit`: `debug(glm.fit); debug(glm); glm(y~x1+x2+x3, family=Gamma(link = "identity"))`. – jblood94 Jan 11 '23 at 17:37
Here my query is about `(2 - y*eta)`. Why you use this? Inverse link function is 1/eta. – J AK Jan 12 '23 at 02:24
1

`z = eta + (y - mu)*g'(mu)`; `mu = 1/eta`, `g'(mu) = -1/mu^2`; `z = eta - (y*eta^2 - eta) = 2*eta - y*eta^2`; `w = 1/(V(mu)*(g'(mu)^2) = eta^-2`; `z*sqrt(w) = 2 - y*eta`. (See lines 95-97 in `glm.fit`, noting that `w` on line 96 is what is usually represented as `sqrt(w)`). – jblood94 Jan 12 '23 at 12:58

swihart · Answer 2 · 2023-01-12T17:27:55.000

Your inverse function is negative. Take away the minus sign.
Also, change pshape to 1.0.
I'm setting a seed for reproducibility.
Initial values for small datasets is key. Setting them using glm results is a common approach if you can get a similar enough link. Another approach would be that in the answer by @jblood94. Yet another one would be to use nls() for (rough) initial estimates.
argument trace=TRUE in glm() will show how many iterations

set.seed(111)
library(gnlm)
# custom link / inverse 
inv <- function(eta)  1/(eta)

n<-10
y <- rgamma(n, 10, 0.1)
x1 <- rnorm(n, -1,1)
x2 <- rnorm(n, -1,1)
x3 <- rnorm(n, -1,1)

x<-as.matrix(cbind(1,x1,x2,x3))
reg <-glm(y~x1+x2+x3, family=Gamma(link = "inverse"), trace=TRUE)
library(gnlm)
reg1<- gnlr(y=y,
     distribution = "gamma",
     mu = ~ inv(beta0 + beta1*x1 + beta2*x2 + beta3*x3),
     pmu = c(0.002, -0.002, -0.001, -0.001), ## or set to reg$coeff,
     pshape=1
)

cbind(c(reg$coeff,NA), reg1$coeff)

Which gives:


> cbind(c(reg$coeff,NA), reg1$coeff)
                     [,1]          [,2]
(Intercept)  0.0033899338  0.0033914440
x1          -0.0037481699 -0.0037476263
x2          -0.0007462714 -0.0007463346
x3          -0.0014941431 -0.0014936034
                       NA  2.8592334563

An example of different link and using nls to get starting values:

nls.init3 <-
nls(y ~  beta0 + 1/(beta1+1)*x1 + sqrt(beta2)*x2 + beta3^2*x3,
    data=data.frame(y=y, x1=x1, x2=x2, x3=x3),
    start=list(beta0=1,beta1=.1,beta2=.1,beta3=.1)
    )
summary(nls.init3)$coefficients[,1]

reg3<- gnlr(y=y,
     distribution = "gamma",
     mu = ~  beta0 + 1/(beta1+1)*x1 + sqrt(beta2)*x2 + beta3^2*x3,
     pmu = summary(nls.init3)$coefficients[,1],
     pshape=1
)

reg3$coeff

And another

nls.init4 <-
nls(y ~  exp(beta0 + 1/(beta1+1)*x1),
    data=data.frame(y=y, x1=x1),
    start=list(beta0=0, beta1=0)
    )
summary(nls.init4)$coefficients[,1]

reg4<- gnlr(y=y,
     distribution = "gamma",
     mu = ~  exp(beta0 + 1/(beta1+1)*x1),
     pmu = summary(nls.init4)$coefficients[,1],
     pshape=1
)

reg4$coeff

Issues related to Gamma Regression

2 Answers2

Linked