I have been trying to use tbl_svysummary() to create a summary table in gtsummary for survey data using weights. I am using the srvyr package to create the tbl_svy object with as_survey_design(). Colleagues have had no issues with this approach and are able to knit outputs to HTML documents with no problems.
Unfortunately, every time I try to use tbl_svysummary(), I get the error Error: $ operator is invalid for atomic vectors. The problem (from examining the traceback and a lot of trial and error) appears to be that srvyr-created tbl_svy objects store the x$call component of the design object differently than survey does (more on this below). This happens even with an empty tbl_svysummary() call (e.g. data %>% tbl_svysummary()), but I will use an example with a formatted table for better illustration.
I was able to reproduce the problem using the example code and data from the "Preparing a survey dataset" section of the survey vs. srvyr vignette (https://cran.r-project.org/web/packages/srvyr/vignettes/srvyr-vs-survey.html), then plugging the resulting design into the example tbl_svysummary() code from the Survey Data section of the gtsummary tutorial (http://www.danieldsjoberg.com/gtsummary/articles/tbl_summary.html#survey-data). I added the both variable to the survey vs. srvyr code so that it plugs easily into the gtsummary tutorial code.
Without further ado... Packages:
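library(tidyverse)
library(survey)
library(srvyr)
library(gtsummary)
# load the example API datasets shipped with survey (provides apistrat)
data(api, package = "survey")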
The following code:
strat_design_survey <- svydesign(~1, strata = ~stype, fpc = ~fpc,
                                 variables = ~stype + api99 + api00 + api.stu + both,
                                 weight = ~pw, data = apistrat)
strat_design_survey %>%
  tbl_svysummary(
    # stratify summary statistics by the "both" column
    by = both,
    # summarize a subset of the columns
    include = c(api00, api99, both),
    # adding labels to table
    label = list(api00 ~ "API in 2000",
                 api99 ~ "API in 1999")
  ) %>%
  add_p() %>% # comparing values by "both" column
  add_overall() %>%
  # adding spanning header
  modify_spanning_header(c("stat_1", "stat_2") ~ "**Met Both Targets**")
Produces a perfect-looking gtsummary output, which you can see here: Ideal tbl_svysummary output.
While this code:
strat_design_srvyr <- apistrat %>%
  as_survey_design(1, strata = stype, fpc = fpc, weight = pw,
                   variables = c(stype, both, starts_with("api")))
strat_design_srvyr %>%
  tbl_svysummary(
    # stratify summary statistics by the "both" column
    by = both,
    # summarize a subset of the columns
    include = c(api00, api99, both),
    # adding labels to table
    label = list(api00 ~ "API in 2000",
                 api99 ~ "API in 1999")
  ) %>%
  add_p() %>% # comparing values by "both" column
  add_overall() %>%
  # adding spanning header
  modify_spanning_header(c("stat_1", "stat_2") ~ "**Met Both Targets**")
Produces this error: Error: $ operator is invalid for atomic vectors
Looking through the traceback, the error appears to happen when tbl_svysummary() attempts to evaluate all.vars(x$call$id). Simply typing any of the following into the console produces the same error: strat_design_srvyr$call$id, strat_design_srvyr$call$weights, etc. The equivalent for the design object generated by svydesign() does not hit the same issue (e.g. strat_design_survey$call$weight). Something appears to be amiss with how srvyr produces the tbl_svy, which in turn breaks the subsetting that tbl_svysummary() relies on.
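To illustrate the failure mode described above, here is a minimal base-R sketch. It does not use the real design objects: design_with_call and design_with_string are hypothetical stand-ins, and call_is_language() is a helper name made up for this example. The assumption being illustrated is that the call component holds something other than a call object in the failing case; that has not been confirmed against the srvyr source.

```r
# Stand-in objects (hypothetical, not the real designs): one stores a real
# call in its `call` component, the other stores a plain character string.
design_with_call   <- list(call = quote(svydesign(ids = ~1, weights = ~pw)))
design_with_string <- list(call = "not a call object")

# `$` works on a call object, so an argument can be pulled out and parsed.
all.vars(design_with_call$call$weights)

# `$` on an atomic (character) vector raises an error; capture its message.
msg <- tryCatch(
  design_with_string$call$id,
  error = function(e) conditionMessage(e)
)
msg

# Hypothetical helper: does a design-like object store a call object?
call_is_language <- function(design) is.call(design$call)
call_is_language(design_with_call)
call_is_language(design_with_string)
```

Running is.call(strat_design_srvyr$call) and is.call(strat_design_survey$call) on the real objects should show whether the two designs differ in the same way.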
And before anyone comments, "why don't you just use svydesign()?": I am (a) hoping to sort out using srvyr and as_survey_design() instead, since the latter seems to work for my colleagues, and (b) having trouble getting a svydesign()-generated design object to work on my actual data and code (which I cannot share due to size and privacy issues), likely partly due to the fact that it is part of a fairly intricate Rmd document which I would prefer not to alter.
Finally, I have already tried updating, uninstalling, and reinstalling all related packages, restarting the R session, ending the session and restarting RStudio entirely, and restarting my computer.
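Since the same code works for colleagues, one thing worth comparing is the installed version of each package on both machines. The sketch below is only a convenience for collecting those versions; it assumes nothing about which version (if any) is responsible, and the variable names are made up for this example.

```r
# Packages involved in the reproduction above.
pkgs <- c("survey", "srvyr", "gtsummary")

# Report the installed version of each package, or NA if it is not installed.
installed_versions <- vapply(
  pkgs,
  function(p) {
    if (requireNamespace(p, quietly = TRUE)) {
      as.character(utils::packageVersion(p))
    } else {
      NA_character_
    }
  },
  character(1)
)
print(installed_versions)
```

Running the same snippet on a colleague's machine and comparing the two results side by side would show whether the environments actually match.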