A special request this time, since I know how to get to my desired table output but would like to know whether a less wordy solution exists with expss. First off, this topic can be considered an extension of this discussion --> Complex tables with expss package, and is also related to this other one --> How to display results from only select subgroups + the whole data frame in an expss table?

My table is built as follows: results on all rows of the data frame first, then split by subgroups. Here is how I currently proceed (example with the infert dataset):

1) Table template

library(expss)
data(infert)
### Banner set up
my_banner = infert %>%
  tab_cols(total())
my_custom_table = . %>%  
  tab_significance_options(sig_level=0.2, keep="none", sig_labels=NULL, subtable_marks="greater", mode="append") %>%
  tab_stat_cases(label="N", total_row_position="above", total_statistic="u_cases", total_label="TOTAL") %>% 
  tab_stat_cpct(label="%Col.", total_row_position="above", total_statistic="u_cpct", total_label="TOTAL") %>%
  # Parity x Education
  tab_cols(education) %>%
  tab_stat_cases(label="N", total_row_position="above", total_statistic="u_cases", total_label="TOTAL") %>% 
  tab_last_add_sig_labels() %>%
  tab_stat_cpct(label="%Col.", total_row_position="above", total_statistic="u_cpct", total_label="TOTAL") %>%
  tab_last_add_sig_labels() %>%
  tab_last_sig_cpct(label="T.1", compare_type="subtable")

2) Creation of 3 distinct tables (1 for total and 1 for each subgroup), merged into one:

tab1 <- my_banner %>%
  tab_cells(parity) %>%
  my_custom_table() %>%
  tab_pivot(stat_position="inside_columns")
tab2 <- infert %>%
  apply_labels(education="education (CASE 0)") %>%
  tab_cells(parity) %>%
  tab_cols(total(label = "CASE 0")) %>%
  tab_subgroup(case==0) %>%
  my_custom_table() %>%
  tab_pivot(stat_position="inside_columns")
tab3 <- infert %>%
  apply_labels(education="education (CASE 1)") %>%
  tab_cells(parity) %>%
  tab_cols(total(label = "CASE 1")) %>%
  tab_subgroup(case==1) %>%
  my_custom_table() %>%
  tab_pivot(stat_position="inside_columns")

final_tab <- tab1 %merge% tab2 %merge% tab3

All this code is for just one table, so you can understand my concern. Is there a good practice to avoid this lengthy (yet working) sequence? My first guess was:

my_banner %>%
  tab_cells(parity) %>%
  my_custom_table() %>%
  tab_subgroup(case==0) %>%
  my_custom_table() %>%
  tab_subgroup(case==1) %>%
  my_custom_table() %>%
  tab_pivot(stat_position="inside_columns")

A table is computed, but the output is nowhere near the objective. There is probably a fix, but I have no idea where to look. Any help would be appreciated, thank you! (Note: if a simple solution involves getting rid of the #TOTAL columns, that is also fine with me.)

Maxence Dum.

1 Answer

The key idea is to use %nest% in the tab_cols instead of tab_subgroup:

library(expss)
data(infert)
my_banner = infert %>%
    apply_labels(
        education = "education",
        case = c(
            "CASE 0" = 0,
            "CASE 1" = 1
        )
    ) %>% 
    tab_cols(total(), education, case %nest% list(total(label = ""), education))

my_custom_table = . %>%  
    tab_significance_options(sig_level=0.2, keep="none", sig_labels=NULL, subtable_marks="greater", mode="append") %>%
    tab_stat_cases(label="N", total_row_position="above", total_statistic="u_cases", total_label="TOTAL") %>% 
    tab_last_add_sig_labels() %>%
    tab_stat_cpct(label="%Col.",
                  total_row_position="above", 
                  total_statistic=c("u_cases", "u_cpct"), 
                  total_label=c("TO_DELETE_TOTAL", "TOTAL")) %>%
    tab_last_add_sig_labels() %>%
    tab_last_sig_cpct(label="T.1", compare_type="subtable") %>% 
    tab_pivot(stat_position="inside_columns") %>% 
    # drop auxiliary rows and columns
    where(!grepl("TO_DELETE", row_labels)) %>% 
    except(fixed("Total|T.1"), fixed("CASE 0|T.1"), fixed("CASE 1|T.1"))

my_banner %>% 
    tab_cells(parity) %>% 
    my_custom_table()
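
To see the nesting on its own, here is a stripped-down sketch. It assumes default column percentages and leaves out the significance tests and the clean-up of auxiliary rows and columns; the object name `nested_tab` is only illustrative.

```r
library(expss)
data(infert)

# Minimal illustration of nesting in the banner:
# total, education, then (total + education) within each level of case.
nested_tab = infert %>%
    apply_labels(
        education = "education",
        case = c("CASE 0" = 0, "CASE 1" = 1)
    ) %>%
    tab_cells(parity) %>%
    tab_cols(total(), education, case %nest% list(total(label = ""), education)) %>%
    # one statistic only, to keep the layout easy to read
    tab_stat_cpct(total_row_position = "above", total_label = "TOTAL") %>%
    tab_pivot()

nested_tab
```

Because the subgroups come from the column specification, the whole table is built in one pass and no `%merge%` is needed.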
Gregory Demin
  • Thankfully you always have the answer! This works great, thanks @Gregory Demin. One step further: I wonder if it is possible to subset factors with this method? I tried replacing the existing `case=` sequence with this one: `case = c("CASE 0"=as.logical(infert$case == levels(infert$case)[1]), "CASE 1"=as.logical(infert$case == levels(infert$case)[2]))` The table is created, but a lot of warning messages are displayed as well, so I guess this is not the way to go. The end goal is to nest factor subtotals instead of single levels. – Maxence Dum. Apr 14 '20 at 18:34
  • @MaxenceDum. `apply_labels` isn't subsetting. It's just a way to display 'CASE 0'/'CASE 1' instead of plain 0/1. All the action occurs here: `tab_cols(total(), education, case %nest% list(total(label = ""), education))`. You can completely remove this code from `apply_labels` if you have a variable with good-looking values, e.g. a factor with human-readable levels. – Gregory Demin Apr 14 '20 at 20:25
  • Alright, I got confused, but it's clearer now. However, since the criterion is a logical vector, is there a way to calculate only among TRUE values? From the example above, after case was coerced to a factor: `tab_cols(set_var_lab((case == levels(case)[1]),"CASE 0") %nest% list(unvr(education)))` With this piece of code, both the TRUE and FALSE subtables are computed. We can exclude the FALSE table afterwards of course, but 1) this means extra lines of code and 2) the computation time would stay the same. – Maxence Dum. Apr 15 '20 at 06:55
  • @MaxenceDum. You can select data with `tab_subgroup(case == "desired_subgroup")`. Nevertheless, empty categories will still appear in the table. There is a special function to avoid them: `drop_empty_columns`. It should be applied after `tab_pivot`. – Gregory Demin Apr 15 '20 at 19:27
  • Thanks for your feedback @Gregory Demin, at least I know it is not possible. I managed with my own dataset using the `%nest%` selection as you explained, coupled with a dummy factor with merged levels created specifically for this need. Works great, thanks for the tip! – Maxence Dum. Apr 17 '20 at 17:47
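
The two follow-ups discussed in the comments can be sketched roughly as below. This is only an illustration on `infert`, not the code used on the commenter's own dataset: the merged-level factor `spont_any` (built from `spontaneous`) and the object names are hypothetical, and default column percentages are assumed.

```r
library(expss)
data(infert)

# (a) Restrict the table to one subgroup, then remove the columns
#     that the subgroup leaves empty. drop_empty_columns() is applied
#     after tab_pivot(), as suggested in the comments.
case1_tab = infert %>%
    tab_cells(parity) %>%
    tab_cols(total(), case %nest% education) %>%
    tab_subgroup(case == 1) %>%
    tab_stat_cpct() %>%
    tab_pivot()
case1_clean = drop_empty_columns(case1_tab)

# (b) Nest on a dummy factor with merged levels instead of single levels.
#     'spont_any' is a hypothetical variable created only for this sketch.
infert_merged = infert
infert_merged$spont_any = factor(
    ifelse(infert_merged$spontaneous > 0, "1+ spontaneous", "No spontaneous")
)
merged_tab = infert_merged %>%
    tab_cells(parity) %>%
    tab_cols(total(), spont_any %nest% list(total(label = ""), education)) %>%
    tab_stat_cpct() %>%
    tab_pivot()

case1_clean
merged_tab
```

Variant (a) still computes the empty columns before dropping them, whereas variant (b) only ever creates the columns that are wanted, which seems closer to what the last comment describes.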