1

I would like to build tables of cumulative percentages using the expss package, including both ascending (0% -> 100%) and descending (100% -> 0%) orders. There is already an existing function (namely fre()) for ascending order, although the resulting table is not much customizable.

I would like to include these calculations inside a tab_stat_fun instruction, and managed to get to the desired output for unweighted datasets. Consider the following example (infert dataset):

infert %>%
  tab_cells(age) %>%
  tab_cols(total()) %>%
  tab_stat_cases(label="N", total_row_position="above", total_statistic="u_cases", total_label="TOTAL") %>%
  tab_stat_cpct(label="%Col.", total_row_position="above", total_statistic="u_cpct", total_label="TOTAL") %>%
  tab_stat_fun(label="% Asc.", function(x){100*cumsum(table(sort(x)))/sum(table(sort(x)))}) %>%
  tab_stat_fun(label="% Desc.", function(x){100-(100*cumsum(table(sort(x)))/sum(table(sort(x))))}) %>%
  tab_pivot(stat_position="inside_columns")

Works great, but if I ever want to weigh those results by a numeric vector (for the sake of demonstration: infert$w <- as.vector(x=rep(2, times=nrow(infert)), mode='numeric')), this will inevitably lead to an error since neither sum nor cumsum accept weights argument (as far as I know).

Is there a special built-in function that would do the trick? Or a combination of functions that may imply multiplying the age vector by the weight vector?

Maxence Dum.
  • 121
  • 1
  • 9

1 Answers1

1

There is no such ready-made function. However we can utilize your approach and just replace base::table with base::xtabs. The latter can calculate weighted frequency:

library(expss)
data(infert)
infert$w <- as.vector(x=rep(2, times=nrow(infert)), mode='numeric')

cumpercent = function(x, weight = NULL){
    if(is.null(weight)) weight = rep(1, length(x))
    counts = xtabs(weight ~ x)
    100*cumsum(counts)/sum(counts)    
}

infert %>%
    tab_cells(age) %>%
    tab_cols(total()) %>%
    tab_weight(w) %>% 
    tab_stat_cases(label="N", total_row_position="above", total_statistic="u_cases", total_label="TOTAL") %>%
    tab_stat_cpct(label="%Col.", total_row_position="above", total_statistic="u_cpct", total_label="TOTAL") %>%
    tab_stat_fun(label="% Asc.", cumpercent) %>%
    tab_stat_fun(label="% Desc.", function(x, weight = NULL){100-cumpercent(x, weight)}) %>%
    tab_pivot(stat_position="inside_columns")
Gregory Demin
  • 4,596
  • 2
  • 20
  • 20
  • 1
    Did not know about `xtabs`, only looked for weighted sum and cumsum on the forum... Nice approach, works perfectly as always, thanks @Gregory! – Maxence Dum. Apr 23 '20 at 06:35