I have an experimental predict method for the nestedLogit
package that generates predicted probabilities and their standard errors for
a polytomous response. Both the fitted probabilities and standard errors are
returned as matrices. For plotting, I need to transform each to long format,
and join this to the newdata
data set used in the predictions.
I can sort of do this, but it is clumsy, and my goal is to write a function to do this in general.
Basic example, with three levels of the response partic
, and hincome
, children
:
if(!require(nestedLogit)) install.packages("nestedLogit")
library(dplyr, warn.conflicts = FALSE)
library(tidyr)
data(Womenlf, package="carData")
comparisons <- logits(work=dichotomy("not.work", c("parttime", "fulltime")),
full=dichotomy("parttime", "fulltime"))
wlf.nested <- nestedLogit(partic ~ hincome + children,
dichotomies = comparisons,
data=Womenlf)
# get predicted values for a grid
new <- expand.grid(hincome=seq(0, 45, length=4),
children=c("absent", "present"))
pred.nested <- predict(wlf.nested, newdata = new)
names(pred.nested)
> names(pred.nested)
[1] "p" "logit" "se.p" "se.logit"
(The predict
function also returns logits and their standard errors, but here
I'm just interested in the probabilities, p
, and std errors se.p
Here's what I've done with this example:
# matrices don't work well with tidy processing; these should be data.frames
pred.nested.df <- lapply(pred.nested, as.data.frame)
p.fitted <- pred.nested.df$p |> bind_cols(new)
p.se <- pred.nested.df$se.p
p_long <- p.fitted |>
tidyr::pivot_longer(cols = not.work:fulltime,
names_to = "response",
values_to = "prob")
se_long <- p.fitted |>
tidyr::pivot_longer(cols = not.work:fulltime,
names_to = "response",
values_to = "se.prob")
# put them together
plotprob <- cbind(p_long, se.prob = se_long$se.prob)
I get what I want at the end, in plotprob
> head(p_long)
# A tibble: 6 x 4
hincome children response prob
<dbl> <fct> <chr> <dbl>
1 0 absent not.work 0.208
2 0 absent parttime 0.0237
3 0 absent fulltime 0.768
4 15 absent not.work 0.332
5 15 absent parttime 0.0894
6 15 absent fulltime 0.579
> head(se_long)
# A tibble: 6 x 4
hincome children response se.prob
<dbl> <fct> <chr> <dbl>
1 0 absent not.work 0.208
2 0 absent parttime 0.0237
3 0 absent fulltime 0.768
4 15 absent not.work 0.332
5 15 absent parttime 0.0894
6 15 absent fulltime 0.579
> plotprob <- cbind(p_long, se.prob = se_long$se.prob)
> head(plotprob)
hincome children response prob se.prob
1 0 absent not.work 0.20819668 0.20819668
2 0 absent parttime 0.02371554 0.02371554
3 0 absent fulltime 0.76808778 0.76808778
4 15 absent not.work 0.33154549 0.33154549
5 15 absent parttime 0.08936286 0.08936286
6 15 absent fulltime 0.57909165 0.57909165
Perhaps there is a simpler way? But, in any case, I can't see how to write a function to do this. Say,
make_plotdata <- function(fit, se, newdata) {
}
# call as
plotdata <- make_plotdata(pred.nested.df$p, pred.nested.df$se.p, new)