I'd be grateful if someone could tell me why the following is happening and how to correct it.

I'm using the expss package to create a table as follows:

table <- dta %>%
        tab_cells(dta[["x"]]) %>%
        tab_rows(factor(dta[["y"]], ordered=TRUE)) %>%
        tab_weight(dta[["weight"]]) %>%
        tab_stat_cpct(total_statistic = "w_cpct") %>%
        tab_pivot() %>%
        split_columns()

I used factor(dta[["y"]], ordered=TRUE) so that the levels appear in order in the table. This has worked with my other variables, but not with this one.

If I enter only factor(dta[["y"]], ordered=TRUE) in the console, the levels come back in the correct order:

Levels: 537 < 564 < 650 < 1010

However, when I use the code above to build the table, the rows are ordered as follows:

1010 537 564 650

What can I do so that it's in the correct order?

This is a sample dataset to re-create the problem:

dta <- data.frame(x = c(1,1,1,2,1,1,1,1,1,1,1,2,1,2,2,2,1,1,2,2),
                  y = c(1010,650,650,537,650,650,650,650,564,650,650,650,564,564,564,564,650,650,564,564),
                  weight = c(42.066290,3.126177,3.808385,4.812877,8.093253,1.559941,6.168395,2.419531,3.937412,4.293246,20.445602,16.504405,1.314727,2.474295,2.274015,2.668155,3.864480,2.521209,2.605202,2.194348))

Thanks a lot in advance!

R. Isabel
  • looks like it's ordering it alphabetically by first digit rather than by numeric value. Does `tab_rows()` convert to character type or something? – Paul Stafford Allen Jul 21 '23 at 15:36
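
To illustrate the comment above, here is a minimal base R sketch showing that sorting these four values as character strings reproduces exactly the order seen in the table. It only demonstrates the symptom; that expss does something equivalent internally is an assumption, not something confirmed in this thread. The variable names are made up for the example.

```r
# The four levels of y from the question
y_levels <- c(537, 564, 650, 1010)

# Numeric sort: the order expected in the table
numeric_order <- sort(y_levels)

# Character sort: strings are compared character by character,
# so "1010" sorts before "537" because "1" comes before "5"
character_order <- sort(as.character(y_levels))

print(numeric_order)    # 537 564 650 1010
print(character_order)  # "1010" "537" "564" "650"
```

If the row labels are compared as text at any point, this is the order you would get regardless of the factor's level order.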

1 Answer

Yes, it's a bug in expss. You can use a sorting workaround, which reorders the table rows according to the numeric values of their labels:

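sort_workaround = function(tbl){
    # split the row labels (first column of the table) into their parts
    separated_labels = as.data.frame(split_labels(tbl[[1]], remove_repeated = FALSE))
    # convert numeric-looking parts to numbers so they sort by value, not as text
    separated_labels = type.convert(separated_labels, as.is = TRUE)
    # drop the last column to keep total position; drop = FALSE keeps a data.frame
    separated_labels = separated_labels[, -ncol(separated_labels), drop = FALSE]
    # order rows by the remaining label parts, left to right
    new_order = do.call(order, separated_labels)
    tbl[new_order, ]
}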

table <- dta %>%
    tab_cells(x) %>%
    tab_rows(factor(y, ordered=TRUE)) %>%
    tab_weight(weight) %>%
    tab_stat_cpct(total_statistic = "w_cpct") %>%
    tab_pivot() %>% 
    sort_workaround() %>%
    split_columns()


table
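
To see what the workaround does without needing expss, here is a rough base R sketch of its core step. The data frame label_parts is hypothetical: it stands in for the output of split_labels(), and its contents and column names are made up for illustration, so the real label layout in your table may differ.

```r
# Hypothetical label parts; in the workaround these come from split_labels()
label_parts <- data.frame(
    V1 = c("y", "y", "y", "y"),
    V2 = c("1010", "537", "564", "650"),
    V3 = c("#Total", "#Total", "#Total", "#Total"),
    stringsAsFactors = FALSE
)

# Columns that look numeric become numbers; the others stay character
converted <- type.convert(label_parts, as.is = TRUE)

# Drop the last column so it does not take part in the ordering
keys <- converted[, -ncol(converted), drop = FALSE]

# order() sorts by the first column and breaks ties with the next one
new_order <- do.call(order, keys)

print(label_parts$V2[new_order])
```

Because V2 is converted to integers before ordering, the rows are ranked by numeric value, so 537 comes first and 1010 last.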
Gregory Demin