I have the following table:
df = structure(list(test_id = c(1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4), person = structure(c(1L,
2L, 3L, 2L, 3L, 1L, 2L, 3L, 4L, 1L, 3L), .Label = c("a", "b",
"c", "d"), class = "factor"), points = c(1, 5, 2, 6, 5, 3, 4,
5, 6, 2, 1)), .Names = c("test_id", "person", "points"), row.names = c(NA,
-11L), class = "data.frame")
I want to determine a couple of scenarios and need your help on scenarios 2 & 3:
- Which single person tallies the most points across all the tests,
e.g. using dplyr:
df %>% group_by(person) %>% summarize(most_points = sum(points)) %>% top_n(1,most_points)
Which two people combine to tally the most points across all tests (taking the maximum point value between the two people for each test)
Expected output: person a & person b would be the best two person combination, with the sum of their points being 17 (b wins tests 1,2 & 3 and a wins test 4).
Which three people combine to tally the most points across all tests (taking the maximum point value between the three people for each test)
Expected output: person a, person b & person c would be the best three person combination, with the sum of their points being 19 (person b wins tests 1 & 2, person d wins test 3, person a wins test 4)
This is a very simplified table and in reality I have hundreds of thousands of rows I'll be analyzing with many more people and test_id's. Note that not everyone has a point value for each test. I'm not interested in a weighted average, simply maximum points they (or a combination of two/three of them) have accumulated.