
After some data munging and a call to spread, I arrived at the following table: Complaint types and Boroughs

I would like to identify the top 4 issues in each borough. Sorting the table does not help, because there are 4 borough columns and each one needs its own ordering. Any thoughts on how to get this?

Jyo Nookula

1 Answer


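You can subset the complaint type column with order(column, decreasing=TRUE)[1:4]. order() returns the row positions of the four greatest values in the vector, not the values themselves, and those positions pick out the matching complaint types. It is then easy to convert the result to whatever form is needed; here a data frame makes sense: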

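# For each borough column, look up the complaint types at the positions of its four largest counts
lst <- lapply(df[-1], function(col) df[, 'Complaint.Type'][order(col, decreasing=TRUE)[1:4]])
# Combine the per-borough results into one data frame
as.data.frame(lst)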
#     BRONX BROOKLYN MANHATTAN   QUEENS
#1 Facility Facility     Adopt Facility
#2    Abuse    Abuse  Advocate    Adopt
#3     Park      Air      Park     Park
#4 Advocate    Adopt     Abuse Advocate

Data

df <- data.frame(Complaint.Type=c('Adopt', 'Advocate', 'Air', 'Abuse', 'Facility','Park'),
                 BRONX=c(0,5, 1, 33, 81, 7),
                 BROOKLYN=c(2,0,100,148,177, 1),
                 MANHATTAN=c(129,49,2,9,1,15),
                 QUEENS=c(50,3,0,3,2469,6))
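To make the role of order() concrete, here is a minimal sketch for a single borough, using only the BRONX column from the data above. The names idx and top_bronx are made up for illustration, and stringsAsFactors = FALSE is set explicitly so the example behaves the same on older R versions.

```r
# Only the BRONX column of the sample data, with strings kept as characters
df <- data.frame(Complaint.Type = c('Adopt', 'Advocate', 'Air', 'Abuse', 'Facility', 'Park'),
                 BRONX = c(0, 5, 1, 33, 81, 7),
                 stringsAsFactors = FALSE)

# order() gives row positions sorted by count, largest first; keep the first four
idx <- order(df$BRONX, decreasing = TRUE)[1:4]
idx
# [1] 5 4 6 2

# The same positions index both the complaint types and their counts
top_bronx <- data.frame(Complaint.Type = df$Complaint.Type[idx],
                        Count = df$BRONX[idx],
                        stringsAsFactors = FALSE)
top_bronx
```

Keeping the counts next to the types, as top_bronx does, can help when checking how ties were resolved.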
Pierre L
  • Worked like a charm, thank you! Could you help me understand what `df[-1]` and `function(col)` do? – Jyo Nookula Dec 04 '15 at 16:36
  • `df[-1]` returns the data frame without its first column (`df` itself is not changed). We don't want to include the complaint types in the ranking – Pierre L Dec 04 '15 at 16:43
  • `function(col) df[,'Complaint......[1:4]]` is called an anonymous function. I made up the name `col`; any valid variable name would have worked. `col` just makes sense as a name because we are talking about columns. A full explanation of anonymous functions can be found at https://www.safaribooksonline.com/library/view/the-art-of/9781593273842/ch07s13.html – Pierre L Dec 04 '15 at 16:46
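
The two points from the comments can be sketched on a cut-down, three-row version of the data. This is an illustration only: top_two is a hypothetical helper name, and the example keeps the top 2 rather than the top 4 because it has only three rows.

```r
# Cut-down sample: three complaint types, two boroughs
df <- data.frame(Complaint.Type = c('Adopt', 'Advocate', 'Air'),
                 BRONX = c(0, 5, 1),
                 BROOKLYN = c(2, 0, 100),
                 stringsAsFactors = FALSE)

# df[-1] is a copy of df without the first column; df still has all three
names(df[-1])
# [1] "BRONX"    "BROOKLYN"

# A named function (hypothetical helper) that does the lookup for one column
top_two <- function(counts) df$Complaint.Type[order(counts, decreasing = TRUE)[1:2]]
named <- lapply(df[-1], top_two)

# The same logic written inline as an anonymous function
anonymous <- lapply(df[-1], function(col) df$Complaint.Type[order(col, decreasing = TRUE)[1:2]])

# Both forms give the same result
identical(named, anonymous)
# [1] TRUE
```

The anonymous form saves defining a name for a function that is used only once; the named form is easier to reuse and test.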