I currently have a data frame that looks like this...

Year           School     AveragePoints     
2012-2013     Alabama        2.5
2012-2013     Alabama        5.4
2012-2013     Alabama        10.4
2012-2013     Alabama        1.2
2012-2013     Alabama        9.2
2012-2013     Alabama        7.3

Each row represents one player on that team for that year, so the first row means that one Alabama player averaged 2.5 points in 2012-2013. The real data frame is much longer, with more teams and with the years continuing through 1997-1998. I want to count how many players averaged 0-4, 4.1-9, 9.1-14, and >14.1 points for each school in each year. In other words, for 2012-2013, how many Alabama players fall into each of those 4 categories? I need those counts for every year, for Alabama and for the other schools involved. I think some form of an apply function should be used, but I'm not sure.


1 Answer

If I understand your question correctly, you should look into cut first, and then into aggregate (or tapply) or possibly even table.

Here's how I would proceed:

Add a column with the cut results.

mydf$AP <- with(mydf, cut(AveragePoints, c(0, 4.1, 9.1, 14.1)))
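
Note that with these breaks, an average above 14.1 (or exactly 0) comes back as NA from cut, so the >14.1 group from the question would not be counted. Here is a minimal sketch of one way to cover it, assuming the averages are recorded to one decimal place. The breaks, the labels, and the rebuilt sample data frame are my assumptions, not something given in the question.

```r
# Sample data rebuilt from the rows shown in the question
mydf <- data.frame(
  Year = "2012-2013",
  School = "Alabama",
  AveragePoints = c(2.5, 5.4, 10.4, 1.2, 9.2, 7.3)
)

# Assumed breaks: with one-decimal data, (4,9] is the same as 4.1-9, and so on.
# Inf adds an open-ended top bin; include.lowest = TRUE keeps an average of 0.
mydf$AP <- cut(
  mydf$AveragePoints,
  breaks = c(0, 4, 9, 14, Inf),
  labels = c("0-4", "4.1-9", "9.1-14", "14.1+"),
  include.lowest = TRUE
)

# Count players per bin, year, and school; empty bins show up as 0
counts <- table(mydf[c("AP", "Year", "School")])
```

Because AP is a factor, table keeps the empty "14.1+" level, which makes it easy to see which schools had no players in the top group.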

Here's a table approach:

table(mydf[c("AP", "Year", "School")])
# , , School = Alabama
# 
#             Year
# AP           2012-2013
#   (0,4.1]            2
#   (4.1,9.1]          2
#   (9.1,14.1]         2

However, the output of aggregate is probably in a much more useful format.

aggregate(. ~ Year + School + AP, mydf, length)
#        Year  School         AP AveragePoints
# 1 2012-2013 Alabama    (0,4.1]             2
# 2 2012-2013 Alabama  (4.1,9.1]             2
# 3 2012-2013 Alabama (9.1,14.1]             2
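
For completeness, the tapply route mentioned at the top could look roughly like the sketch below. It rebuilds the sample rows from the question so that it runs on its own; the variable name counts is just for illustration.

```r
# Sample data rebuilt from the rows shown in the question
mydf <- data.frame(
  Year = "2012-2013",
  School = "Alabama",
  AveragePoints = c(2.5, 5.4, 10.4, 1.2, 9.2, 7.3)
)

# Same binning as in the steps above
mydf$AP <- with(mydf, cut(AveragePoints, c(0, 4.1, 9.1, 14.1)))

# tapply splits AveragePoints by every Year/School/AP combination
# and applies length() to each group, giving a 3-dimensional array
counts <- tapply(mydf$AveragePoints, mydf[c("Year", "School", "AP")], length)

# Look up one cell by its dimension names
counts["2012-2013", "Alabama", "(0,4.1]"]
```

Unlike table, tapply leaves combinations with no players as NA rather than 0, so aggregate or table is usually easier to work with for counts.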