I have a list of universities that have won Nobel Prizes. There are two columns: one for the university and one for the prize category (physics, literature, ...). If a university has won more than one prize, it appears once per prize, each time with the category of that prize.

uni_data <- nobel %>% 
  group_by(name_of_university) %>%
  summarise(category)
  

How can I compute the distribution of prizes as a percentage? For example, if Duck University has won 4 prizes, 3 in literature and 1 in physics, how could I compute that distribution in a separate column? In this case, it would be 75% for literature and 25% for physics.

I want to do this with all universities.

I have tried grouping the list by university, but after that I have not found a function to assign the percentages.

    Welcome to SO, Julio1000MX! https://stackoverflow.com/q/11562656/3358272 is a good start for summarizing by group. If you don't provide sample data (see https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info), any answers you get will be based on different data, which may or may not inform your question. Go through that link and if you still can't figure it out, [edit] your question and post the output from `dput(head(uni_data,20))` into a code block. Thanks! – r2evans Oct 31 '22 at 18:46

1 Answer

You can count the prizes per university and category, then divide by each university's total to get a proportion (multiply by 100 if you want a percentage):

library(dplyr)
nobel %>%
  count(name_of_university, category) %>%
  group_by(name_of_university) %>%
  mutate(proportion = n / sum(n)) %>%
  ungroup()
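
As a minimal sketch of the same approach on made-up data: the question did not include sample data, so the `nobel` data frame below, including "Goose College", is purely hypothetical, and the `percentage` column name is an assumption. Only the column names `name_of_university` and `category` come from the question.

```r
library(dplyr)

# Hypothetical sample data: one row per prize, as described in the question.
nobel <- data.frame(
  name_of_university = c(rep("Duck University", 4), rep("Goose College", 2)),
  category = c("literature", "literature", "literature", "physics",
               "chemistry", "physics")
)

result <- nobel %>%
  # One row per university/category pair, with the prize count in `n`
  count(name_of_university, category) %>%
  # Group so that sum(n) is each university's total, not the grand total
  group_by(name_of_university) %>%
  mutate(percentage = 100 * n / sum(n)) %>%
  ungroup()

print(result)
```

With this data, the Duck University rows should come out as 75 for literature and 25 for physics, matching the example in the question.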
Gregor Thomas