I have a targets pipeline that uses the tbl_svysummary function from the gtsummary package to generate a summary statistics table. When I run the pipeline's functions one at a time, with the necessary objects loaded into the global environment, tbl_svysummary works fine. However, when I run the entire pipeline, this function throws an error.
The error message I receive is as follows:
Error in statistic argument input. Select from age, age_group, sex, hiv, orphan, sp_non_gov, sp_gov, sp_any, edu_enrol, edu_attainment, 'edu_ecostr_work', 'srh_condom_use', 'srh_multiple_partners', 'srh_transactional', 'srh_child_marriage'.
Here's the code for the do_table_summary_stats function:
#' Generate Summary Statistics Table with Weighted % and 95% CIs
#'
#' @param control_var A vector of control variables.
#' @param outcome_var A vector of outcome variables.
#' @param pred_var A vector of predictor variables.
#' @param design `svydesign` object representing the survey design.
#'
#' @return Called for its side effect: the table is saved as a .docx file (table_1.docx).
#' @export
#'
#' @examples
do_table_summary_stats <- function(outcome_var, pred_var, control_var, design) {
  t <- gtsummary::tbl_svysummary(
    data = design,
    by = sex,
    missing = "no",
    include = c(age, age_group, control_var, pred_var, outcome_var),
    statistic = list(age ~ "{mean} ({sd})", all_categorical() ~ "{p}%"),
    type = age ~ "continuous",
    label = age ~ "Age"
  ) %>%
    gtsummary::add_p() %>%
    gtsummary::separate_p_footnotes() %>%
    gtsummary::add_significance_stars() %>%
    gtsummary::add_ci(include = all_categorical()) %>%
    gtsummary::add_overall() %>%
    gtsummary::modify_header(
      label = "**Characteristic**",
      stat_2 = "**Girls** (N=2,967)",
      stat_1 = "**Boys** (N=627)",
      stat_0 = "**Total** (N=3,594)",
      p.value = "**p-value**"
    ) %>%
    gtsummary::modify_column_merge(
      pattern = "{stat_2} [{ci_stat_2}]",
      rows = NULL
    ) %>%
    gtsummary::modify_column_merge(
      pattern = "{stat_1} [{ci_stat_1}]",
      rows = NULL
    )

  # Save to a docx file
  t %>%
    as_flex_table() %>%
    save_as_docx(path = file.path(config$outdir_lso, config$outdat_t1))
}
Objects used in the function:
outcome_var <- c(
  "edu_enrol",
  "edu_attainment",
  "edu_ecostr_work",
  "srh_condom_use",
  "srh_multiple_partners",
  "srh_transactional",
  "srh_child_marriage"
)

pred_var <- c(
  "sp_non_gov",
  "sp_gov",
  "sp_any"
)

control_var <- c(
  "age",
  "sex",
  "hiv",
  "orphan"
)

design <- survey::svydesign(
  id = ~ psu,
  strata = ~ district,
  weights = ~ individual_weight,
  data = vacs_lso,
  single = "centered"
)
The do_table_summary_stats function takes the following parameters:
outcome_var: A vector of outcome variables.
pred_var: A vector of predictor variables.
control_var: A vector of control variables.
design: A svydesign object representing the survey design.
To troubleshoot the issue, I have already checked the availability and loading of objects into the global environment, ensured the correct type and format of variables, and verified that the variable names used in the function call match the column names in the dataset. However, I still encounter the aforementioned error when running the pipeline as a whole.
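The check that variable names match the column names can be sketched as follows. This is a minimal illustration on a toy data frame, not the real data: check_columns and vacs_toy are hypothetical names, and for a real svydesign object the columns would be read from design$variables instead.

```r
# Hypothetical helper: returns the requested names that are NOT columns of `data`.
check_columns <- function(vars, data) {
  setdiff(vars, names(data))
}

# Toy stand-in for the survey data (assumption; the real data is vacs_lso).
vacs_toy <- data.frame(
  age = c(14, 17),
  sex = c("F", "M"),
  hiv = c(0, 1),
  orphan = c(0, 0),
  sp_gov = c(1, 0),
  edu_enrol = c(1, 1)
)

requested <- c("age", "sex", "hiv", "orphan", "sp_gov", "sp_any", "edu_enrol")

# character(0) would mean every requested variable is present.
missing_cols <- check_columns(requested, vacs_toy)
print(missing_cols)
#> [1] "sp_any"
```

An empty result from this kind of check suggests the names themselves are not the problem, which points instead to how the selection is evaluated inside the pipeline.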
Any help or suggestions on resolving this issue would be greatly appreciated. Thank you in advance!
I tried changing the arguments in the statistic specification:
statistic = list(age ~ "{mean} ({sd})", all_categorical() ~ "{p}%")
I also tried running the function outside of the pipeline, where it worked perfectly.
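One difference between the two settings that may be relevant: as far as I understand, targets attaches only the packages declared for the pipeline before building a target, whereas an interactive session usually has gtsummary and flextable loaded already. The function calls all_categorical(), as_flex_table(), save_as_docx() and %>% without a package prefix, so they are only found if those packages are attached. Below is a minimal sketch of a _targets.R that declares them. The file path R/functions.R and the target name table_1 are assumptions for illustration, not my actual setup, and this may not be the cause of the error.

```r
# _targets.R (sketch; file path and target name are hypothetical)
library(targets)

# Attach the packages whose functions are used without a `pkg::` prefix
# inside do_table_summary_stats(): all_categorical() and as_flex_table()
# (gtsummary), save_as_docx() (flextable), and the %>% pipe (magrittr).
tar_option_set(
  packages = c("gtsummary", "flextable", "survey", "magrittr")
)

# Assumed location of the do_table_summary_stats() definition.
source("R/functions.R")

# outcome_var, pred_var, control_var and design are assumed to be upstream
# targets or objects defined earlier in this file.
list(
  tar_target(
    table_1,
    do_table_summary_stats(outcome_var, pred_var, control_var, design)
  )
)
```

An alternative that does not depend on which packages are attached would be to write the selectors and helpers with explicit prefixes, for example gtsummary::all_categorical() and flextable::save_as_docx().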