Hello to R/expss experts! This is a follow-up question to this one --> Complex tables with expss package.
I added subtotals to already complex tables using the excellent expss package, and it works well for most tasks (counts, proportions, means...). Yet, I found out statistical tests evaluation differ between one table without subtotals and the exact same with subtotals. @Gregory Demin, your knowledge would be greatly appreciated :)
An example to illustrate my words, using the infert dataset available in the datasets package:
example <- infert %>%
tab_significance_options(sig_level=0.2, keep="none", sig_labels=NULL, subtable_marks="greater", mode="append") %>%
tab_cols(total(), education) %>%
tab_cells(parity) %>%
# block for cases
tab_stat_cases(label="N", total_row_position="above", total_statistic="u_cases", total_label="TOTAL") %>%
tab_last_add_sig_labels() %>%
# block for percent statistic - Subtable tests
tab_stat_cpct(label="%Col.", total_row_position="above", total_statistic="u_cpct", total_label="TOTAL") %>%
tab_last_add_sig_labels() %>%
tab_last_sig_cpct(label="T.1", compare_type="subtable") %>%
# block for percent statistic - First column tests
tab_stat_cpct(label="T.2", total_row_position="above", total_statistic="u_cpct", total_label="TOTAL") %>%
tab_last_add_sig_labels() %>%
tab_last_sig_cpct(compare_type="first_column", mode="replace") %>%
tab_pivot(stat_position="inside_columns") %>%
# converts NA to zero
recode(as.criterion(is.numeric) & is.na ~ 0, TRUE ~ copy)
example <- example[,-c(4,5)]
print(example)
Note: sig_level is very high (20%) to illustrate this specific issue, do not panic :) This is the starting point and I am fine with that. Then we only add the subtotals (line 5):
example2 <- infert %>%
tab_significance_options(sig_level=0.2, keep="none", sig_labels=NULL, subtable_marks="greater", mode="append") %>%
tab_cols(total(), education) %>%
tab_cells(parity) %>%
tab_subtotal_cells("#FIRST 3"=c(1,2,3),"#LAST 3"=c(4,5,6), position = "above") %>%
# block for cases
tab_stat_cases(label="N", total_row_position="above", total_statistic="u_cases", total_label="TOTAL") %>%
tab_last_add_sig_labels() %>%
# block for percent statistic - Subtable tests
tab_stat_cpct(label="%Col.", total_row_position="above", total_statistic="u_cpct", total_label="TOTAL") %>%
tab_last_add_sig_labels() %>%
tab_last_sig_cpct(label="T.1", compare_type="subtable") %>%
# block for percent statistic - First column tests
tab_stat_cpct(label="T.2", total_row_position="above", total_statistic="u_cpct", total_label="TOTAL") %>%
tab_last_add_sig_labels() %>%
tab_last_sig_cpct(compare_type="first_column", mode="replace") %>%
tab_pivot(stat_position="inside_columns") %>%
# converts NA to zero
recode(as.criterion(is.numeric) & is.na ~ 0, TRUE ~ copy)
example2 <- example2[,-c(4,5)]
print(example2)
I do not know what is happening, but the results of significance tests are not the same this time. Besides, I feel no significance test is calculated on the two subtotal rows. Any insight?