
I'm trying to create a set of crosstabs in R using the tab_cells command in the expss package that shows me counts, the total, and my NAs. I can't get it to give me NAs.

I've tried using na_if and tab_mis_val and I've tried doing it using the cro command. I've found a frequency table that I really like using fre and I want to replicate it basically as a crosstab. I've also used tabyl from the janitor package and can get the NA row but I can only run one crosstab at a time instead of saying from var1 to var10.

#I feel like I'm close with this

data%>%  
  tab_cells(var1 %to% var10) %>%  
  tab_cols(total(), var12) %>%  
  tab_stat_cpct() %>% 
  tab_mis_val() %>% 
  tab_pivot() 

#frequency table I really like that DOES give me NAs
expss_output_viewer()
calculate(data, fre(as.list(var1 %to% var10)))

#attempt to make it as a crosstab

expss_output_viewer()
calculate(data, cro_cases(as.list(var1 %to% var10), var12))


# Using tabyl I can get NAs, but it will only give me one crosstab at a
# time instead of a whole set of them.
library(janitor)

data %>% 
tabyl(var1, var12) %>% 
adorn_totals(c("row", "col")) %>% 
adorn_percentages("row") %>% 
adorn_pct_formatting() %>% 
adorn_ns() %>% 
adorn_title("combined") %>% 
knitr::kable() 

I want a table with counts, percents, a total row, and an NA row. I cannot seem to get the NA row with tab_cells.

MDEWITT
Carley

1 Answer


if_na should do the trick:

data%>%  
  tab_cells(
    add_val_lab(if_na(var1 %to% var10, 999), c("<NA>" = 999))
  ) %>%  
  tab_cols(total(), var12) %>%  
  tab_stat_cpct() %>% 
  tab_pivot() 

Yes, it is rather verbose, and I am going to simplify it in future expss versions. If you don't use value labels, you can drop the add_val_lab function: if_na(var1 %to% var10, '<NA>') will be enough.

UPDATE 21.05

Code with percent under total:

library(expss)
data(mtcars)
mtcars = apply_labels(mtcars,
                      mpg = "Miles/(US) gallon",
                      cyl = "Number of cylinders",
                      disp = "Displacement (cu.in.)",
                      hp = "Gross horsepower",
                      drat = "Rear axle ratio",
                      wt = "Weight (lb/1000)",
                      qsec = "1/4 mile time",
                      vs = "Engine",
                      vs = c("V-engine" = 0,
                             "Straight engine" = 1),
                      am = "Transmission",
                      am = c("Automatic" = 0,
                             "Manual"=1),
                      gear = "Number of forward gears",
                      carb = "Number of carburetors"
)


# create a function that calculates a single table with the NA percent under the total
tab_with_na = function(x, weight = NULL) {
    rbind(
        cro_cpct(
            list(unvr(x)),  # list(unvr()) to completely drop variable labels
            col_vars = "|",
            weight = weight), 
        # calculate percent of NA
        cro_mean(
            list("<NA>" = unvr(is.na(x)*100)),  # list(unvr()) to completely drop variable labels
            col_vars = "|",
            weight = weight)

    )
}

mtcars %>%  
    tab_cells(am %to% carb) %>%  
    tab_cols(total(), vs) %>% 
    tab_stat_fun(tab_with_na) %>% 
    tab_pivot()  
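
To see why taking the mean of is.na(x)*100 produces the "<NA>" percent row, here is a small base-R sketch on a made-up vector. The vector x and the weights w are hypothetical, and weighted.mean is only a base-R stand-in for illustration; it is not a claim about how expss applies its weight argument internally.

```r
# Hypothetical toy vector with 2 missing values out of 8
x <- c(0, 1, NA, 1, NA, 0, 1, 1)

# is.na(x) is TRUE for missing values; multiplying by 100 gives
# 100 for each NA and 0 otherwise, so the mean is the percent of NAs
pct_na <- mean(is.na(x) * 100)
print(pct_na)  # 25

# Hypothetical case weights: each of the two NA cases counts double
w <- c(1, 1, 2, 1, 2, 1, 1, 1)

# Weighted share of missing values (base-R stand-in for a weighted mean)
pct_na_weighted <- weighted.mean(is.na(x) * 100, w)
print(pct_na_weighted)  # 40
```

The unweighted result is 2 of 8 cases, and the weighted result is a total NA weight of 4 out of a total weight of 10.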
Gregory Demin
  • Thank you! I'll give it a try! – Carley May 20 '19 at 13:47
  • Okay, so I'm getting my NA row, which is great. Can I have it calculate my total row so that it is the total number of valid responses, not including the NAs? So I would have the # total cases row above the NA row? – Carley May 20 '19 at 16:28