I am trying to create a data frame (using either tidyr::expand_grid or tibble::tibble) in order to generate posterior predictions with tidybayes::epred_draws (akin to posterior_predict). I have three continuous predictors that I would like to vary simultaneously across three set values: 1 standard deviation below the mean of each predictor, the mean of each predictor, and 1 standard deviation above the mean of each predictor. The issue I am running into is that I cannot figure out a way to generate values in between those set points while keeping the structure of the dataset intact.
I created a reproducible example below. As you can see, the final posterior prediction doesn't look great. Is there any way to generate additional incremental values between -1 SD, the mean, and +1 SD?
My go-to method would be either seq() or modelr::seq_range(data_var_1, pretty=TRUE, n=100), but I'm not sure how to incorporate that into the new dataset in a way that lets me see what happens when the predictors all shift at once.
Let me know if I can explain anything else.
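To make the "structure" point concrete, here is a minimal base-R sketch of the two ways a prediction grid can be built. It uses expand.grid() and data.frame() as stand-ins for tidyr::expand_grid and tibble::tibble, and the level values are made-up placeholders rather than numbers from my data. Crossing gives every combination of the predictors, whereas I want the predictors to move in lock-step, one row per shift.

```r
# Placeholder values standing in for -1 SD, mean, +1 SD of each predictor
# (hypothetical numbers, not taken from the model below)
levels_1 <- c(3, 5, 7)
levels_2 <- c(6, 8, 10)
levels_3 <- c(8, 10, 12)

# Crossing: every combination of the three predictors (3 * 3 * 3 rows)
crossed <- expand.grid(var_1 = levels_1, var_2 = levels_2, var_3 = levels_3)

# Lock-step: row i holds the i-th level of every predictor (3 rows)
lockstep <- data.frame(var_1 = levels_1, var_2 = levels_2, var_3 = levels_3)

nrow(crossed)
nrow(lockstep)
```

The lock-step layout is the one I am after, since each row then represents all three predictors sitting at the same distance from their means.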
library(brms)
library(tidybayes)
library(dplyr)     # provides tibble() and the %>% pipe used below
library(ggplot2)
library(ggthemes)
## create a dataset
set.seed(123) # arbitrary seed, added only so the simulated data are reproducible
data <- tibble(
  outcome = rnorm(100, 2, 2),
  var_1 = rnorm(100, 5, 2),
  var_2 = rnorm(100, 8, 2),
  var_3 = rnorm(100, 10, 2)
)
## model the data
m1 <- brms::brm(outcome ~ var_1 + var_2 + var_3, data) # run model (takes a few sec.)
## prepare for predictions with set values
# var_1, var_2 and var_3 live inside `data`, so they must be referenced as data$...
new_data <- tibble(
  var_1 = mean(data$var_1) + c(-1, 0, 1) * sd(data$var_1),
  var_2 = mean(data$var_2) + c(-1, 0, 1) * sd(data$var_2),
  var_3 = mean(data$var_3) + c(-1, 0, 1) * sd(data$var_3)
)
# generate grand mean posterior predictions (for more on this,
# see: https://www.andrewheiss.com/blog/2021/11/10/ame-bayes-re-guide/)
pred_1 <- m1 %>%
  tidybayes::epred_draws(new_data)
# visualize posterior predictions (example isn't so pretty, sorry)
plot_1 <- ggplot(pred_1, aes(x = var_1, y = .epred)) +
  stat_lineribbon() +
  scale_fill_brewer(palette = "Reds") +
  labs(x = "Shifts in Var 1, 2, and 3", y = "Outcome",
       fill = "Credible interval") +
  ggthemes::theme_pander() +
  theme(legend.position = "bottom") +
  scale_x_continuous(limits = c(new_data$var_1[1], new_data$var_1[3]),
                     breaks = c(new_data$var_1[1],
                                new_data$var_1[2],
                                new_data$var_1[3]),
                     labels = c("-1 SD", "Mean", "+1 SD"))
plot_1 # print the plot
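
For what it's worth, the closest I have come to the seq() idea is the rough sketch below. I have not verified it against the model. It builds one shared sequence of z-scores from -1 to +1 and maps it back onto each predictor's own scale, so every row moves all three predictors by the same number of standard deviations. The names sim_data, at_z, and dense_data are hypothetical, the seed is arbitrary, and it uses a plain data.frame only to keep the sketch free of package dependencies.

```r
set.seed(1)  # arbitrary seed, only so this sketch is reproducible

# Simulated predictors with the same shape as the example data
sim_data <- data.frame(
  var_1 = rnorm(100, 5, 2),
  var_2 = rnorm(100, 8, 2),
  var_3 = rnorm(100, 10, 2)
)

# One shared grid of z-scores; 101 points so that z = 0 (the mean) is included
z <- seq(-1, 1, length.out = 101)

# Hypothetical helper: convert z-scores to a predictor's original scale
at_z <- function(x, z) mean(x) + z * sd(x)

# Each row shifts all three predictors by the same number of SDs
dense_data <- data.frame(
  z = z,
  var_1 = at_z(sim_data$var_1, z),
  var_2 = at_z(sim_data$var_2, z),
  var_3 = at_z(sim_data$var_3, z)
)

head(dense_data, 3)
```

If this is on the right track, I assume dense_data would take the place of new_data in the epred_draws() call, with the plot's x aesthetic mapped to either z or var_1, but I would appreciate confirmation that this is a sensible way to do it.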