The short question is: profile() returns 12 parameter values. How can it be made to return a greater number?
The motivation for my question is to reproduce Fig. 1.3 in Applied Logistic Regression 3rd Edition by David W. Hosmer Jr., Stanley Lemeshow and Rodney X. Sturdivant (2009), which plots the profile log likelihood against the coefficient for x = age over the confint() interval.
The glm model was
fit <- glm(chd ~ age, data = chdage, family = binomial(link = "logit"))
which relates the presence or absence of coronary heart disease to age for 100 patients. The results of the model agree with Table 1.3 in the text on p. 10.
For convenience, a csv file of the data is in my gist
Using the guidance provided by Ben Bolker for multiplying MASS::profile's output of deviance by -0.5 to convert to negative log-likelihood in a 2011 post using the tidy function provided by jebyrnes in later comment on the same post.
library(dplyr)
library(MASS)
library(purrr)
get_profile_glm <- function(aglm){
prof <- MASS:::profile.glm(aglm)
disp <- attr(prof,"summary")$dispersion
purrr::imap_dfr(prof, .f = ~data.frame(par = .y,
deviance=.x$z^2*disp+aglm$deviance,
values = as.data.frame(.x$par.vals)[[.y]],
stringsAsFactors = FALSE))
}
pll <- get_profile_glm(fit) %>% filter(par == "age") %>% mutate(beta = values) %>% mutate(pll = deviance * -0.5) %>% select(-c(par,values, deviance))
pll
> pll
beta pll
1 0.04895 -57.70
2 0.06134 -56.16
3 0.07374 -55.02
4 0.08613 -54.25
5 0.09853 -53.81
6 0.11092 -53.68
7 0.12332 -53.80
8 0.13571 -54.17
9 0.14811 -54.74
10 0.16050 -55.49
11 0.17290 -56.41
12 0.18529 -57.47
This can be plotted to obtain an approximation of the HLS figure 1.3 with the ablines for the alpha = 0.95 interval from
confint(fit)
and
logLik(fit)
The asymmetry in the legend can be calculated with
asymmetry <- function(x) {
ci <- confint(x, level = 0.95)
ci_lower <- ci[2,1]
ci_upper <- ci[2,2]
coeff <- x$coefficients[2]
round(100 * ((ci_upper - coeff) - (coeff - ci_lower))/(ci_upper - ci_lower), 2)
}
asym <- assymetry(fit)
The plot is produced with
ggplot(data = pll, aes(x = beta, y = pll)) +
geom_line() +
scale_x_continuous(breaks = c(0.06, 0.08, 0.10, 0.12, 0.14, 0.16)) +
scale_y_continuous(breaks = c(-57, -56, -55, -54)) +
xlab("Coefficient for age") +
ylab("Profile log-likelihood function") +
geom_vline(xintercept = confint(fit)[2,1]) +
geom_vline(xintercept = confint(fit)[2,2]) +
geom_hline(yintercept = (logLik(fit) - (qchisq(0.95, df = 1)/2))) +
theme_classic() +
ggtitle(paste("Asymmetry =", scales::percent(asym/100, accuracy = 0.1))) +
theme(plot.title = element_text(hjust = 0.5))
Two adjustments are needed:
The curve should be smoothed by the addition of beta values and log likelihood values along the x and y axes, respectively.
The range of beta should be set comparably to approximately [0.0575,0.1625] (visually, from the figure). I assume this can be done by subsetting as required.
A note regarding the logLik y intercept on the figure. It appears to be based on a transposed value of log-likelihood. See Table 1-3 at p. 10, where it is given as -53.676546, compared to the equation on p. 19 where it is transposed to -53.6756.