
I am writing a function to produce a frequency table using prop.table, and want to produce this for several categorical variables in the data set.

I am using datasets::mtcars for this example. I would like the function to group by the binary variable "am", so that the output is stratified by am == 1 and am == 0.

Is there a way to add a group_by step to the code below?

summary(mtcars)

apply(mtcars[c("cyl", "gear", "carb")], 2, 
                    \(x) prop.table(table(x, useNA = "always"))*100)
M--
four77

2 Answers


You can use by() in base R:

by(mtcars, mtcars$am, function(mt_am) 
  sapply(mt_am[c("cyl", "gear", "carb")], 
        function(mt_col) prop.table(table(mt_col, useNA = "always"))*100))
#> mtcars$am: 0
#> $cyl
#> mt_col
#>        4        6        8     <NA> 
#> 15.78947 21.05263 63.15789  0.00000 
#> 
#> $gear
#> mt_col
#>        3        4     <NA> 
#> 78.94737 21.05263  0.00000 
#> 
#> $carb
#> mt_col
#>        1        2        3        4     <NA> 
#> 15.78947 31.57895 15.78947 36.84211  0.00000 
#> 
#> ------------------------------------------------------------ 
#> mtcars$am: 1
#> $cyl
#> mt_col
#>        4        6        8     <NA> 
#> 61.53846 23.07692 15.38462  0.00000 
#> 
#> $gear
#> mt_col
#>        4        5     <NA> 
#> 61.53846 38.46154  0.00000 
#> 
#> $carb
#> mt_col
#>         1         2         4         6         8      <NA> 
#> 30.769231 30.769231 23.076923  7.692308  7.692308  0.000000
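
One caveat with sapply() is that it simplifies its result to a matrix whenever every table happens to have the same length, so the shape of the output can change with the data. A minimal sketch of the same idea with split() and lapply(), which always returns a nested list, might look like this (the names vars and freq_by_am are only illustrative):

```r
# Columns to tabulate; chosen to match the question.
vars <- c("cyl", "gear", "carb")

# split() gives one data frame per level of am; lapply() never simplifies,
# so the result is always a list (per group) of lists (per variable).
freq_by_am <- lapply(split(mtcars[vars], mtcars$am), function(d) {
  lapply(d, function(col) prop.table(table(col, useNA = "always")) * 100)
})

# Percentages of cyl among cars with am == 0
freq_by_am[["0"]]$cyl
```

Individual tables can then be picked out by group and variable name, as in the last line.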
M--

Here is a function that takes a data set x, the name of a grouping variable, and any number of column names. For each group it returns a list of data.frames, one per column.

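perc_table <- function(x, group, ...) {
  # collect the column names passed through ... into a character vector
  cols <- list(...) |> unlist()
  # run the inner function once per level of the grouping variable
  by(x[cols], x[[group]], \(x) {
    # note: apply() coerces the data frame to a matrix before looping over columns
    apply(x, 2, \(y) {
      tbl <- prop.table(table(y, useNA = "always")) |> as.data.frame()
      tbl$Freq <- tbl$Freq*100  # proportions to percentages
      tbl
    })
  })
}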

perc_table(mtcars, group = "am", "cyl", "gear", "carb")
#> x[[group]]: 0
#> $cyl
#>      y     Freq
#> 1    4 15.78947
#> 2    6 21.05263
#> 3    8 63.15789
#> 4 <NA>  0.00000
#> 
#> $gear
#>      y     Freq
#> 1    3 78.94737
#> 2    4 21.05263
#> 3 <NA>  0.00000
#> 
#> $carb
#>      y     Freq
#> 1    1 15.78947
#> 2    2 31.57895
#> 3    3 15.78947
#> 4    4 36.84211
#> 5 <NA>  0.00000
#> 
#> ------------------------------------------------------------ 
#> x[[group]]: 1
#> $cyl
#>      y     Freq
#> 1    4 61.53846
#> 2    6 23.07692
#> 3    8 15.38462
#> 4 <NA>  0.00000
#> 
#> $gear
#>      y     Freq
#> 1    4 61.53846
#> 2    5 38.46154
#> 3 <NA>  0.00000
#> 
#> $carb
#>      y      Freq
#> 1    1 30.769231
#> 2    2 30.769231
#> 3    4 23.076923
#> 4    6  7.692308
#> 5    8  7.692308
#> 6 <NA>  0.000000

Created on 2023-08-22 with reprex v2.0.2
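
If a single table is easier to work with than a nested list, the per-group results can also be stacked into one long data frame. The following is only a sketch under that assumption; the names vars and long_tbl and the column names variable, level and percent are illustrative, not part of the function above:

```r
# Columns to tabulate; chosen to match the question.
vars <- c("cyl", "gear", "carb")

# One block of rows per (am, variable) pair, stacked with rbind().
long_tbl <- do.call(rbind, lapply(split(mtcars, mtcars$am), function(d) {
  do.call(rbind, lapply(vars, function(v) {
    tbl <- prop.table(table(d[[v]], useNA = "always")) * 100
    data.frame(am = d$am[1],             # group value for this block
               variable = v,             # which column was tabulated
               level = names(tbl),       # category label (NA for missing values)
               percent = as.vector(tbl)) # share of the group, in percent
  }))
}))
rownames(long_tbl) <- NULL

head(long_tbl)
```

Because every row carries its group and variable, the result can be filtered or reshaped with ordinary data frame tools.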

Rui Barradas