
So let me be a little more specific. I have a dataset with two columns, SOCCERTEAM and PLAYERS (this is just an example, not the actual dataset):

  SOCCERTEAM   PLAYERS
  BARCA        MESSI
  BARCA        MESSI
  BARCA        MESSI
  BARCA        XAVI
  RM           CR
  RM           CR
  RM           PEPE
  RM           HIQUAIN
  etc.

I want the answer to this question: "How can I find the top 5 teams according to how many players they used?" Teams can use a player more than once, so each team/player pair can appear in several rows and simply finding the factor levels is not an option. For example, if BARCA used 15 different players and RM used 14, then BARCA is first.

Fallen Greg
  • Try `library(data.table);head(setDT(df1)[, .(n = uniqueN(PLAYERS)), SOCCERTEAM][order(-n)]$SOCCERTEAM, 5)` – akrun May 21 '17 at 16:54
  • @akrun thanks for the help, it worked, even though I can't really work out what this part does: `[, .(n = uniqueN(PLAYERS)), SOCCERTEAM][order(-n)]$SOCCERTEAM, 5)`. Why do we use `[ ]` after `setDT(df1)`? – Fallen Greg May 21 '17 at 17:21
  • You should probably take a look at [Getting Started with `data.table`](https://github.com/Rdatatable/data.table/wiki/Getting-started). – Gregor Thomas May 21 '17 at 17:37
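
To make the chained `[ ]` calls in the comment above easier to follow, here is a minimal sketch that splits the one-liner into separate steps. The sample data is hypothetical (it only mirrors the example rows from the question), and the intermediate names `dt`, `counts`, `sorted` and `top_teams` are made up for illustration.

```r
library(data.table)

# Hypothetical sample data mirroring the example in the question
df1 <- data.frame(
  SOCCERTEAM = c("BARCA", "BARCA", "BARCA", "BARCA", "RM", "RM", "RM", "RM"),
  PLAYERS    = c("MESSI", "MESSI", "MESSI", "XAVI", "CR", "CR", "PEPE", "HIQUAIN"),
  stringsAsFactors = FALSE
)

# Step 1: setDT() converts df1 to a data.table by reference and returns it,
# which is why the one-liner can index the result directly with [ ]
dt <- setDT(df1)

# Step 2: general form is dt[i, j, by]. Here i is empty (all rows),
# j counts the distinct players, and the third argument groups by SOCCERTEAM
counts <- dt[, .(n = uniqueN(PLAYERS)), SOCCERTEAM]

# Step 3: a second [ ] is chained onto that result to sort by n, descending
sorted <- counts[order(-n)]

# Step 4: pull out the team names and keep at most the first five
top_teams <- head(sorted$SOCCERTEAM, 5)
print(top_teams)
```

Each `[ ]` returns a new data.table, so the next `[ ]` simply operates on the previous result. That is all the chaining in the original one-liner is doing.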

1 Answer

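library(dplyr)

# Count the distinct players per team, keep the 5 teams with the most
# (top_n keeps ties), then sort so the team with the most players is first
df %>% 
  group_by(SOCCERTEAM) %>% 
  summarize(rank = n_distinct(PLAYERS)) %>%
  top_n(5, wt = rank) %>%
  arrange(desc(rank))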
yeedle
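
As a rough check of the same idea on concrete data, the sketch below runs a closely related pipeline on hypothetical rows taken from the question's example. It is not the answer's exact code: it assumes dplyr 1.0.0 or later and swaps in `distinct()`, `count()` and `slice_max()`, which return the rows already sorted in descending order. The column name `n_players` and the result name `top_teams` are made up for illustration.

```r
library(dplyr)

# Hypothetical sample data mirroring the example in the question
df <- data.frame(
  SOCCERTEAM = c("BARCA", "BARCA", "BARCA", "BARCA", "RM", "RM", "RM", "RM"),
  PLAYERS    = c("MESSI", "MESSI", "MESSI", "XAVI", "CR", "CR", "PEPE", "HIQUAIN"),
  stringsAsFactors = FALSE
)

top_teams <- df %>%
  distinct(SOCCERTEAM, PLAYERS) %>%          # one row per team/player pair
  count(SOCCERTEAM, name = "n_players") %>%  # players used per team
  slice_max(n_players, n = 5)                # up to 5 teams, largest first

print(top_teams)
```

Because repeated team/player rows are dropped before counting, a player who appears many times for the same team is only counted once, which is the behaviour the question asks for.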