
I have been trying to use tbl_svysummary() to create a summary table in gtsummary for survey data using weights. I am using the srvyr package to create the tbl_svy object using as_survey_design(). Colleagues have had no issues with this formula and are able to knit outputs to HTML documents with no problems.

Unfortunately, every time I try to use tbl_svysummary(), I get this error: Error: $ operator is invalid for atomic vectors. The problem (from examining the traceback and a lot of trial and error) appears to be that srvyr-created tbl_svy objects store the x$call component of the design object differently than survey does (more on this below). This happens even with an empty tbl_svysummary() call (e.g. data %>% tbl_svysummary()), but I will use an example with a formatted table for better illustration.

I was able to reproduce the problem using the example code and data from the "Preparing a survey dataset" section of the survey vs. srvyr vignette (https://cran.r-project.org/web/packages/srvyr/vignettes/srvyr-vs-survey.html), and then plugging the result into the example tbl_svysummary() code from the Survey Data section of the gtsummary tutorial (http://www.danieldsjoberg.com/gtsummary/articles/tbl_summary.html#survey-data). I added the both variable to the vignette code so that it plugs easily into the gtsummary tutorial code.

Without further ado... Packages:

library(tidyverse)
library(survey)
library(srvyr)
library(gtsummary)

# load the example api datasets (including apistrat) from the survey package
data(api)

The following code:

strat_design_survey <- svydesign(~1, strata = ~stype, fpc = ~fpc,
                                 variables = ~stype + api99 + api00 + api.stu + both,
                                 weight = ~pw, data = apistrat)

strat_design_survey %>%
  tbl_svysummary(
    # stratify summary statistics by the "both" column
    by = both, 
    # summarize a subset of the columns
    include = c(api00, api99, both),
    # adding labels to table
    label = list(api00 ~ "API in 2000",
                 api99 ~ "API in 1999")
  ) %>%
  add_p() %>%   # comparing values by "both" column
  add_overall() %>%
  # adding spanning header
  modify_spanning_header(c("stat_1", "stat_2") ~ "**Met Both Targets**")

Produces a perfect-looking gtsummary output, which you can see here: Ideal tbl_svysummary output.

While this code:

strat_design_srvyr <- apistrat %>%
  as_survey_design(1, strata = stype, fpc = fpc, weight = pw,
                variables = c(stype, both, starts_with("api"))) 

strat_design_srvyr %>%
  tbl_svysummary(
    # stratify summary statistics by the "both" column
    by = both, 
    # summarize a subset of the columns
    include = c(api00, api99, both),
    # adding labels to table
    label = list(api00 ~ "API in 2000",
                 api99 ~ "API in 1999")
  ) %>%
  add_p() %>%   # comparing values by "both" column
  add_overall() %>%
  # adding spanning header
  modify_spanning_header(c("stat_1", "stat_2") ~ "**Met Both Targets**")

Produces this error: Error: $ operator is invalid for atomic vectors

Looking through the traceback, the error appears to happen when tbl_svysummary() attempts to evaluate this expression: all.vars(x$call$id). Simply typing any of the following into the console produces the same error: strat_design_srvyr$call$id, strat_design_srvyr$call$weights, etc. The equivalent for the design object generated by svydesign() does not encounter the same issue (i.e. strat_design_survey$call$weight). Something appears to be different in how srvyr stores the call in the tbl_svy, which in turn breaks the subsetting that tbl_svysummary() relies on.
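To illustrate the mechanism described above, here is a minimal base-R sketch. The objects cl and s are hypothetical stand-ins and not the real design objects: cl mimics a stored call like the one svydesign() appears to keep, and s mimics a $call component that is an atomic vector instead. Whether srvyr actually stores a string there is an assumption on my part.

```r
# Base R only; `cl` and `s` are hypothetical stand-ins for the $call component.

# A call object is recursive, so `$` can pull out a named argument.
cl <- quote(svydesign(id = ~1, strata = ~stype, weights = ~pw))
is.call(cl)

# all.vars() then lists the variable names used inside that argument.
id_vars <- all.vars(cl$id)        # ~1 contains no variable names
wt_vars <- all.vars(cl$weights)   # ~pw contains the variable pw

# A character string is atomic, so the same `$` access throws an error.
s <- "not a call"
msg <- tryCatch(s$id, error = function(e) conditionMessage(e))
print(msg)
```

The same check can be run on the real objects with class(strat_design_survey$call) and class(strat_design_srvyr$call). If the second one reports something other than "call", that would be consistent with the error above.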

And before anyone comments, "why don't you just use svydesign()?" I am (a) hoping to sort out using srvyr and as_survey_design() instead, since the latter seems to work for my colleagues, and (b) having trouble getting a svydesign()-generated design object to work with my actual data and code (which I cannot share due to size and privacy issues), likely partly due to the fact that it is part of a fairly intricate Rmd document which I would prefer not to alter.

Finally, I have already tried updating, uninstalling, and reinstalling all related packages, restarting the R session, ending the session and restarting RStudio entirely, and restarting my computer.

Chris M.
  • Hi Chris, The tbl_svysummary function was written for use with the survey package. If you'd like to request support for survey objects created with the srvyr package, I suggest you make the feature request on the gtsummary GitHub page. – Daniel D. Sjoberg May 03 '21 at 13:48
  • Thanks for the quick reply! Will make the request. – Chris M. May 03 '21 at 13:51
  • For anyone interested - the feature request and subsequent upgrade is here: https://github.com/ddsjoberg/gtsummary/issues/886 Thanks again Daniel and Joseph! – Chris M. May 11 '21 at 09:28
