How to create a column of percentages within a grouped dataframe?

Question

I have created a frequency table, DF, using the code below. However, I would also like to add a column of percentages/proportions to the table, showing the percentage/proportion of each Function within each key. I am not sure how to adapt my code to do this. Any advice and help would be appreciated!

DF <- df %>%
  gather(key = 'key', value = 'freq', -Function) %>%
  mutate(freq = as.numeric(freq)) %>%
  group_by(Function, key) %>%
  summarise(freq = sum(freq))

score 1 · Accepted Answer · answered Jul 21 '20 at 00:10

Try this:

library(dplyr)
df %>%
  tidyr::gather(key = 'key', value = 'freq', -Function) %>%
  mutate(freq = as.numeric(freq)) %>% 
  group_by(key, Function) %>% 
  summarise(freq=sum(freq)) %>% #..... (1)
  mutate(freq = freq/sum(freq))

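Note that:

gather() has been retired (superseded, in current tidyr terminology): it still works but is no longer developed, so use pivot_longer() for new code.
The above works without an explicit second group_by(key) because summarise() at (1) drops only the last level of grouping, i.e. Function, so the data is still grouped by key when the final mutate() runs. This is why the grouping order is key, Function, the reverse of the order in the question. The final mutate() overwrites freq with the proportion; multiply by 100 if you want percentages.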
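
As a rough sketch of both notes together, the same pipeline can be written with pivot_longer(). The data frame df here is made up for illustration (the question does not show its data), and the column names A and B are assumptions. The .groups = "drop_last" argument only spells out the default behaviour described above and needs dplyr 1.0.0 or later. The proportion is stored in a new column, prop, so the counts are kept.

```r
library(dplyr)
library(tidyr)

# Hypothetical input: counts stored as text, one column per key
df <- tibble(
  Function = c("f1", "f2", "f1", "f2"),
  A = c("1", "3", "2", "2"),
  B = c("4", "0", "1", "5")
)

result <- df %>%
  # Long format: one row per (Function, key) observation
  pivot_longer(cols = -Function, names_to = "key", values_to = "freq") %>%
  mutate(freq = as.numeric(freq)) %>%
  group_by(key, Function) %>%
  # Drops only the last grouping level (Function); still grouped by key
  summarise(freq = sum(freq), .groups = "drop_last") %>%
  # sum(freq) is therefore the total within each key
  mutate(prop = freq / sum(freq))

result
```

Within each key the prop values should add up to 1, which is a quick way to check that the grouping is what you intended.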

score 0 · Answer 2 · answered Jul 21 '20 at 00:10

If I understood your problem correctly, you can continue by grouping by key and then calculating the percentage/proportion:

df %>%
  gather(key = 'key', value = 'freq', -Function) %>%
  mutate(freq = as.numeric(freq)) %>%
  group_by(Function, key) %>%
  summarise(freq = sum(freq)) %>%
  group_by(key) %>%               # regroup by key only
  mutate(prop = freq / sum(freq)) # share of each Function within its key
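
If you want percentages rather than proportions, one possible follow-up is sketched below. The counts table is a made-up stand-in for the summarised data, and the column name pct and the rounding to one decimal place are my own choices, not something from the question.

```r
library(dplyr)

# Hypothetical summarised counts, one row per (Function, key)
counts <- tibble(
  Function = c("f1", "f2", "f1", "f2"),
  key = c("A", "A", "B", "B"),
  freq = c(3, 5, 5, 5)
)

with_pct <- counts %>%
  group_by(key) %>%
  mutate(
    prop = freq / sum(freq),   # proportion within each key
    pct = round(100 * prop, 1) # same value expressed as a percentage
  ) %>%
  ungroup()                    # drop grouping so later steps are not affected

with_pct
```

The explicit group_by(key) makes the result independent of the order used in the earlier group_by(), at the cost of one extra line.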
