Apologies for what is a pretty basic question. I love using the expss package for table creation, but I am having trouble with some of the output display. My data frame contains a grouping variable plus a few variables to be summarized. I'd like to produce output that shows certain summary statistics for one value of the grouping variable at a time, plus the total for the whole sample. Something like the code below, but with the output1 and output2 objects combined into a single table that keeps the formatting of expss's RStudio Viewer output.

library(expss)

set.seed(12345)

df <- data.frame(group = rep(1:5, each = 4),
                 varA = sample(1:4, 20, replace = TRUE),
                 varB = sample(6:9, 20, replace = TRUE))

output1 <- df[df$group == 1, ] %>%
  tab_cells(varA, varB) %>%
  tab_cols(total(label = "")) %>% 
  tab_stat_fun("Valid N" = w_n, "Mean" = w_mean, "SD" = w_sd,
               "Median" = w_median, method = list) %>% 
  tab_pivot() %>% 
  set_caption("Group 1")

output2 <- df %>%
  tab_cells(varA, varB) %>%
  tab_cols(total(label = "")) %>% 
  tab_stat_fun("Valid N" = w_n, "Mean" = w_mean, "SD" = w_sd,
               "Median" = w_median, method = list) %>% 
  tab_pivot() %>% 
  set_caption("All Groups")

expss_output_viewer()

output1
output2

I know that I can add tab_rows(group) to the pipeline, which displays all of the groups; however, I only want to display one group at a time (plus the total), not all groups at once.

Keith Burt

1 Answer

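There is a special function for subgroups, tab_subgroup. In the code below it is first called with a condition to restrict the statistics to group 1, then called with no arguments so that the second set of statistics covers the whole sample: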

library(expss)

set.seed(12345)

df <- data.frame(group = rep(1:5, each = 4),
                 varA = sample(1:4, 20, replace = TRUE),
                 varB = sample(6:9, 20, replace = TRUE))

output <- df %>%
    tab_cells(varA, varB) %>%
    tab_cols(total(label = "")) %>% 
    tab_subgroup(group == 1) %>% 
    tab_row_label("Group 1") %>% 
    tab_stat_fun("Valid N" = w_n, "Mean" = w_mean, "SD" = w_sd,
                 "Median" = w_median, method = list) %>% 
    tab_row_label("All Groups") %>% 
    tab_subgroup() %>% 
    tab_stat_fun("Valid N" = w_n, "Mean" = w_mean, "SD" = w_sd,
                 "Median" = w_median, method = list) %>% 
    tab_pivot()

expss_output_viewer()

output

Alternatively, you can use tab_rows and net:

library(expss)

set.seed(12345)

df <- data.frame(group = rep(1:5, each = 4),
                 varA = sample(1:4, 20, replace = TRUE),
                 varB = sample(6:9, 20, replace = TRUE))
output <- df %>%
    tab_cells(varA, varB) %>%
    tab_cols(total(label = "")) %>% 
    tab_rows(net(group, "Group 1" = 1,  "All Groups" = 1:5, position = "above")) %>%
    tab_stat_fun("Valid N" = w_n, "Mean" = w_mean, "SD" = w_sd,
                 "Median" = w_median, method = list) %>% 
    tab_pivot()

expss_output_viewer()

output
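
If the same layout is needed for every group in turn, one possible sketch is to wrap the pipeline from the question in a small helper and loop over the group values. This is an illustration only: summary_table, group_ids, group_tables and all_groups_table are hypothetical names, not part of expss, and the subsetting uses plain base R indexing as in the question rather than tab_subgroup. It produces one captioned table per group plus a separate table for the whole sample; it does not merge them into a single table.

```r
library(expss)

set.seed(12345)

df <- data.frame(group = rep(1:5, each = 4),
                 varA = sample(1:4, 20, replace = TRUE),
                 varB = sample(6:9, 20, replace = TRUE))

# Hypothetical helper (not an expss function): builds the summary table
# from the question for whatever rows are passed in and attaches a caption.
summary_table <- function(data, caption) {
    data %>%
        tab_cells(varA, varB) %>%
        tab_cols(total(label = "")) %>%
        tab_stat_fun("Valid N" = w_n, "Mean" = w_mean, "SD" = w_sd,
                     "Median" = w_median, method = list) %>%
        tab_pivot() %>%
        set_caption(caption)
}

# One table per value of the grouping variable, subset with base R indexing.
group_ids <- sort(unique(df$group))
group_tables <- lapply(group_ids, function(g) {
    summary_table(df[df$group == g, ], paste("Group", g))
})
names(group_tables) <- paste("Group", group_ids)

# The same statistics for the whole sample.
all_groups_table <- summary_table(df, "All Groups")
```

After calling expss_output_viewer(), printing group_tables[["Group 1"]] and then all_groups_table should give the two displays from the question; the other groups are available under the names "Group 2" to "Group 5".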
Gregory Demin