
I have data such as this:

library(dplyr); library(janitor)  # dplyr: mutate() and %>%; janitor: tabyl() and adorn_*()
dat <- mtcars %>% mutate(cyl2 = cyl * 2, cyl3 = cyl * 3)

I would like to run each of the following cross tabs [vs,cyl] [vs,cyl2] [vs,cyl3] using tabyl:

I know that I can run the vs by cyl table like this and repeat the call for each of the 'cyl' variables. However, I would like to use some kind of loop instead of repeating the code.

dat %>%
  tabyl(vs, cyl) %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(digits = 2) %>%
  adorn_ns()

So I worked on a function:

run_xtable <- function(data, v1) {
  out <- data %>%
    tabyl(vs, v1) %>%
    adorn_percentages("row") %>%
    adorn_pct_formatting(digits = 2) %>%
    adorn_ns()
  return(out)
}

run_xtable(dat,'cyl')

I have run into some issues, any help is much appreciated!!

  1. The function is not accepting v1 as a reference to a column. Why is this? I tried wrapping it in enquo, but it made no difference.

    Error: Must group by variables found in .data.* Column v1 is not found.

  2. How do I set this up so that I can use something like this to reduce repetition:

    sapply(run_xtable, c('cyl','cyl2','cyl3'))

Thank you!

Matt
NewBee

1 Answer

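tabyl() captures its column arguments unevaluated, so a bare v1 inside the function is looked up as a column literally named v1, which is what the error reports. Since the column name is passed as a string, we can convert it to a symbol with rlang::sym() and unquote it with !!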

run_xtable <- function(data, v1) {
  out <- data %>%
    tabyl(vs, !!rlang::sym(v1)) %>%
    adorn_percentages("row") %>%
    adorn_pct_formatting(digits = 2) %>%
    adorn_ns()
  return(out)
}

Testing:

run_xtable(dat,'cyl')
# vs           4          6           8
#  0  5.56%  (1) 16.67% (3) 77.78% (14)
#  1 71.43% (10) 28.57% (4)  0.00%  (0)
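The same string-to-symbol pattern can be tried in isolation, without janitor. The sketch below is only an illustration: count_by is a hypothetical helper name, and dplyr::count() stands in for tabyl() as another function that captures column arguments unevaluated.

```r
library(dplyr)
library(rlang)

# count_by is a hypothetical helper, not part of dplyr or janitor.
# v1 is a column name given as a string; sym() turns it into a symbol
# and !! unquotes it so count() sees the column itself.
count_by <- function(data, v1) {
  data %>% count(vs, !!sym(v1))
}

res <- count_by(mtcars, "cyl")
res
```

Passing the bare name v1 instead of !!sym(v1) would make count() look for a column called v1, which is the same failure the question describes.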

For multiple columns, loop over the column names and pass each one as v1, keeping data fixed:

lapply(c('cyl','cyl2','cyl3'), run_xtable, data = dat)
#[[1]]
# vs           4          6           8
#  0  5.56%  (1) 16.67% (3) 77.78% (14)
#  1 71.43% (10) 28.57% (4)  0.00%  (0)

#[[2]]
# vs         12          16           8
#  0 16.67% (3) 77.78% (14)  5.56%  (1)
#  1 28.57% (4)  0.00%  (0) 71.43% (10)

#[[3]]
# vs          12         18          24
#  0  5.56%  (1) 16.67% (3) 77.78% (14)
#  1 71.43% (10) 28.57% (4)  0.00%  (0)
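The sapply() call in the question puts the function first, but sapply() and lapply() both expect the vector to iterate over first and the function second. A minimal sketch of a named version follows, assuming the run_xtable definition from this answer; the names cols and tables are just illustrative. With simplify = FALSE, sapply() returns a list and, because the input is a character vector, names each element after its column.

```r
library(dplyr)
library(janitor)

dat <- mtcars %>% mutate(cyl2 = cyl * 2, cyl3 = cyl * 3)

# Same function as in the answer above
run_xtable <- function(data, v1) {
  data %>%
    tabyl(vs, !!rlang::sym(v1)) %>%
    adorn_percentages("row") %>%
    adorn_pct_formatting(digits = 2) %>%
    adorn_ns()
}

cols <- c("cyl", "cyl2", "cyl3")

# Vector first, function second; extra arguments (data = dat) are passed on.
# simplify = FALSE keeps a list; USE.NAMES (default TRUE) names it by column.
tables <- sapply(cols, run_xtable, data = dat, simplify = FALSE)

tables$cyl2  # pick out one table by column name
```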

Or, if we want a single data frame as output, with a column identifying which variable each table came from:

library(purrr)
library(dplyr)
imap_dfr(lst('cyl','cyl2','cyl3'), ~ run_xtable(data = dat, v1 = .x) %>%
         mutate(grp = .y, .before = 1))
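A base-R-flavoured variant of the same idea, as a sketch only: build a named list of tables and let dplyr::bind_rows() add the identifier through its .id argument. The names cols, tables and combined are illustrative, and the as.data.frame() step is a cautious assumption to drop the tabyl class before binding; it may not be strictly needed.

```r
library(dplyr)
library(janitor)

dat <- mtcars %>% mutate(cyl2 = cyl * 2, cyl3 = cyl * 3)

# Same function as in the answer above
run_xtable <- function(data, v1) {
  data %>%
    tabyl(vs, !!rlang::sym(v1)) %>%
    adorn_percentages("row") %>%
    adorn_pct_formatting(digits = 2) %>%
    adorn_ns()
}

cols <- c("cyl", "cyl2", "cyl3")

# Named list: each element is the cross tab for one column
tables <- lapply(setNames(cols, cols), run_xtable, data = dat)

# Stack the tables; the list names become the grp column
combined <- bind_rows(lapply(tables, as.data.frame), .id = "grp")
combined
```

Because the tables have different column headers (the distinct values of each variable), the stacked result has one column per distinct header, and cells for headers that do not occur in a given table are NA.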
akrun