
I'd like to produce nice summaries for a selection of grouping variables in my dataset, where for each grouping variable I would show the 6 most frequent values and their associated proportions. I can get this for a single grouping variable using the syntax:

my_db %>%
  group_by(my_var) %>%
  summarise(n = n()) %>%
  mutate(pc = scales::percent(n / sum(n))) %>%
  arrange(desc(n)) %>%
  head()

How do I modify this expression so it can be used in an apply function?

For example using mtcars, I've tried something like this:

apply(mtcars[c(2:4,11)], 2, 
   function(x) {
    group_by(!!x) %>% 
      summarise(n=n()) %>% 
      mutate(pc=scales::percent(n/sum(n))) %>% 
      arrange(desc(n)) %>% head()
      }
    )

but it doesn't work. Any idea how I can achieve this?

chrisjacques

2 Answers


Iterate over colnames(dat) rather than over the columns themselves, so that each call receives a column name that can be turned into the grouping variable:

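library(dplyr)

dat <- mtcars[c(2:4,11)]

# x is a column name (a string); as.name() turns it into a symbol,
# which !! unquotes inside group_by()
grp <- function(x) {
  group_by(dat, !!as.name(x)) %>%
    summarise(n = n()) %>%
    mutate(pc = scales::percent(n / sum(n))) %>%
    arrange(desc(n)) %>%
    head()
}

# one summary table per column
lapply(colnames(dat), grp)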
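
A variation on the same idea, offered only as a sketch: the helper name top_freq and its arguments data, var and n_top are made up for illustration and are not part of the answer above. It assumes dplyr 1.0 or later (for across() and the .groups argument) and that the scales package is installed. Passing the data frame as an argument avoids relying on a global dat, and naming the input vector gives a named list of results.

```r
library(dplyr)

dat <- mtcars[c(2:4, 11)]

# Hypothetical helper: top `n_top` values of column `var` (a string) in `data`,
# with counts and formatted percentages.
top_freq <- function(data, var, n_top = 6) {
  data %>%
    group_by(across(all_of(var))) %>%         # group by the column named in `var`
    summarise(n = n(), .groups = "drop") %>%  # count rows per value
    mutate(pc = scales::percent(n / sum(n))) %>%
    arrange(desc(n)) %>%
    head(n_top)
}

# Name the vector of column names so the resulting list is named too.
vars <- colnames(dat)
summaries <- lapply(setNames(vars, vars), top_freq, data = dat)

summaries$cyl
```

Each element of summaries is a tibble whose first column keeps the original variable name, so the tables can be looked up by name (summaries$cyl, summaries$carb, and so on).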
Chris
apply(mtcars[c(2:4,11)], 2,
  function(x) {
    mtcars %>%
      group_by(x = !!x) %>%
      summarise(n = n()) %>%
      mutate(pc = scales::percent(n / sum(n))) %>%
      arrange(desc(n)) %>%
      head()
  }
)

You just need to pipe in the parent data frame, so that group_by has a data frame to evaluate the grouping against.

A. Suliman
  • Thanks very much for that, so simple... Unfortunately I get this error message when I use the code on my own dataset and I can't work out what the problem is (the second solution, given by Chris below, doesn't throw that error): Error in FUN(X[[i]], ...) : variable names are limited to 10000 bytes – chrisjacques Jul 10 '18 at 11:50
  • No worries. The problem was that `apply` passes the column's values rather than its name, so the `group_by` column in the new tibble got named after the column's contents, like so: `structure(c(6, 6, 4, 6, 8, 6, 8, 4, 4, 6, 6, 8, 8, 8, 8, 8, 8, \n4,~`. With a very long column, that variable name exceeds the limit, hence the error. – A. Suliman Jul 10 '18 at 12:58
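
To make the difference between the two approaches concrete, here is a small base-R check. It is only a sketch and the variable names cols, seen_by_apply and seen_by_lapply are illustrative. It shows what the anonymous function receives in each case: apply hands it the values of a column, while lapply over colnames hands it a single column name.

```r
cols <- mtcars[c(2:4, 11)]

# apply() coerces the data frame to a matrix and passes each column's values,
# so x is a numeric vector with one entry per row.
seen_by_apply <- apply(cols, 2, function(x) {
  is.numeric(x) && length(x) == nrow(cols)
})

# Iterating over colnames() passes one character string per column,
# which is what a name-based group_by() needs.
seen_by_lapply <- vapply(colnames(cols), function(x) {
  is.character(x) && length(x) == 1
}, logical(1))

seen_by_apply
seen_by_lapply
```

Both results are named logical vectors with one entry per column, all TRUE for this subset of mtcars.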