Select rows from a dataframe based on a condition and then assign a priority number to them in a new column

Question

I have a data set which has information on number of cases of an event reported in each grid.

no of cases	grid number
12	454
22	345
34	67

My task is to assign a priority to each grid box based on the number of cases that appear in it. So if grid number 67 has 34 cases (the most), it is assigned priority 1, grid number 345 gets priority 2, and so on. The result should look something like this:

no of cases	grid number	priority
34	67	1
22	345	2
12	454	3

If there happens to be a tie in assigning the priority number (if two different grids report equal number of cases) the priority should be assigned based on the sum of cases adjacent to the grid of interest. I hope I was able to convey my question clearly.

Being an absolute beginner in R, I am struggling to even begin doing this.

Will really appreciate some help here.

Thank you all!

I did not understand how you want to resolve ties. Can you update your post with an example which has ties and show its expected output? - Ronak Shah, Aug 25 '21 at 10:36

score 0 · Accepted Answer · answered Aug 25 '21 at 10:34

Is this what you are looking for?

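library(dplyr)

df <- data.frame("no_of_cases" = c(12,22,34), "grid_number" = c(454,345,67))

# Sort by case count (highest first); rank() on the negated counts gives the largest count priority 1
df %>% arrange(desc(no_of_cases)) %>% mutate("priority" = rank(-no_of_cases))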
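
One thing to keep in mind with rank(): by default it averages tied values, so two grids with the same count would both get a fractional priority such as 1.5. A minimal sketch of the available behaviours, using made-up case counts that contain a tie (the vector below is illustrative, not from the question):

```r
# Hypothetical case counts with a tie between the first two grids
cases <- c(34, 34, 12)

# Default ties.method = "average": tied grids share the mean of their ranks
avg_rank <- rank(-cases)

# ties.method = "min": tied grids share the best (lowest) rank
min_rank <- rank(-cases, ties.method = "min")

# ties.method = "first": ties are broken by position in the vector
first_rank <- rank(-cases, ties.method = "first")

print(avg_rank)    # 1.5 1.5 3.0
print(min_rank)    # 1 1 3
print(first_rank)  # 1 2 3
```

Which method is appropriate depends on how ties should be resolved; none of them implements the adjacent-grid rule from the question on its own.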

Triss

score 0 · Answer 2 · answered Aug 25 '21 at 10:35

You can arrange the data in decreasing order and assign row number as priority.

library(dplyr)
df %>%
  arrange(desc(no_of_cases)) %>%
  mutate(priority = row_number())

#  no_of_cases grid_number priority
#1          34          67        1
#2          22         345        2
#3          12         454        3

Or in base R -

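# rank() gives each row's position in the sorted order; ties go to the earlier row
df$priority <- rank(-df$no_of_cases, ties.method = "first")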
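
A caution for the base R route: order() returns the row indices that would sort the vector, while rank() returns each element's position in the sorted result. The two happen to coincide for the three-row example in this thread, but not in general. A small check with made-up counts (not from the question):

```r
# Hypothetical counts where order() and rank() disagree
cases <- c(10, 30, 20)

# Indices that would sort the counts in decreasing order
ord <- order(-cases)

# Priority of each element: the largest count gets 1
rnk <- rank(-cases, ties.method = "first")

# Applying order() twice inverts the permutation and also yields ranks
inv <- order(order(-cases))

print(ord)  # 2 3 1
print(rnk)  # 3 1 2
print(inv)  # 3 1 2
```

Here the grid with 10 cases should get priority 3, which is what rnk and inv report; ord would assign it priority 2.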

data

It is easier to help if you provide data in a reproducible format -

df <- structure(list(no_of_cases = c(12L, 22L, 34L), 
grid_number = c(454L, 345L, 67L)), row.names = c(NA, -3L), class = "data.frame")
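
The question also asks for ties to be broken by the sum of cases in adjacent grids. The thread never defines which grids count as adjacent, so the sketch below simply assumes a hypothetical column, adjacent_cases, that already holds that sum for each grid. The data frame df_ties and all of its values are made up for illustration.

```r
# Made-up data: grids 345 and 67 tie on no_of_cases.
# adjacent_cases is a hypothetical, precomputed sum of cases in neighbouring grids.
df_ties <- data.frame(
  no_of_cases    = c(12, 34, 34),
  grid_number    = c(454, 345, 67),
  adjacent_cases = c(5, 10, 40)
)

# Sort by cases (descending), then by adjacent cases (descending) to break ties
ord <- order(-df_ties$no_of_cases, -df_ties$adjacent_cases)
df_ties <- df_ties[ord, ]

# After sorting, the row position is the priority
df_ties$priority <- seq_len(nrow(df_ties))
rownames(df_ties) <- NULL

print(df_ties)
```

With these values, grid 67 comes first because its neighbours report more cases than those of grid 345. How adjacent_cases is actually computed depends on the grid layout, which is not given here.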

score 0 · Answer 3 · answered Aug 25 '21 at 17:39

Using data.table

library(data.table)
setkey(setDT(df)[order(-no_of_cases), priority := .I], priority)[]

-output

   no_of_cases grid_number priority
1:          34          67        1
2:          22         345        2
3:          12         454        3

data

df <- structure(list(no_of_cases = c(12L, 22L, 34L), 
grid_number = c(454L, 345L, 67L)), row.names = c(NA, -3L),
 class = "data.frame")
