I am trying to understand what are the "ld" residuals that are produced from running the residual function on a survreg model?
For example
library(survival)
mod <- survreg(Surv(time, status -1) ~ age , data = lung)
residuals( mod , "ldcase")
residuals( mod, "ldshape")
residuals( mod , "ldresp")
The documentation for the residual function says the following:
Diagnostics based on these quantities are discussed in an article by Escobar and Meeker. The main ones are the likelihood displacement residuals for perturbation of a case weight (ldcase), the response value (ldresp), and the shape.
References
Escobar, L. A. and Meeker, W. Q. (1992). Assessing influence in regression analysis with censored data. Biometrics 48, 507-528.
Taking case weight "ldcase" in particular my understanding from the referenced paper is that these residuals represent an estimate for double the difference in log-likelihood between the original model and the same model fitted by setting subjects i's weight to 2.
However when I attempt to manually code this myself my derived values seem to have no relationship at all to the values produced by the residual function (fully reproducible example below):
library(survival)
library(ggplot2)
mod <- survreg( Surv(time, status -1) ~ age , data = lung)
get_ld <- function(i, mod){
weight <- rep(1 , nrow(lung))
weight[i] <- 2
modw <- survreg(
Surv(time, status -1) ~ age ,
data = lung ,
weights = weight
)
2 * as.numeric(logLik(mod) - logLik(modw))
}
dat <- data.frame(
ld = sapply( 1:nrow(lung), get_ld , mod = mod),
ld_est = residuals(mod , "ldcase")
)
ggplot( data = dat , aes( x = ld_est , y = ld)) + geom_point()
Additionally from the paper these residuals are supposed to be distributed with a 2 * chisq ( p + 2 ) distribution which in this case with p = 1 gives a 1-sided 95% cutoff point of 15.62 which would imply that my manually derived residuals are at least on the correct scale which makes me very confused as to what the residuals returned by "ldcase" actually are ?