I am trying to find a fast way to do the following:
- Determine the yearly quartile values of a variable in a data set
- Compare each value of that variable with the quartile values for its year
- Depending on the result, create a new variable with a value of 0, 1, 2, 3 (rank order)
Here is a reproducible example:
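library(data.table)
set.seed(42)  # fixed seed so the random example is repeatable
# Name the columns directly instead of assigning colnames afterwards
dt <- data.table(year = rep(seq.int(2000, 2010, 1), 30), response = runif(330, 0, 5))
# Return the 0.25, 0.50 and 0.75 quantiles of x
quarts <- function(x) {
  quantile(x, probs = seq(0.25, 0.75, 0.25), na.rm = TRUE, names = TRUE)
}
setkey(dt, year)
a <- dt[, quarts(response), by = key(dt)]  # three rows (one per quartile) for each year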
Now the data.table a contains the needed quartile values of dt$response for every year.
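For illustration, the same quartiles could also be laid out with one row per year, which may be easier to compare against. This is only a sketch: the seed, the object name a_wide and the column names q25, q50 and q75 are made up for this example.

```r
library(data.table)

set.seed(1)  # assumption: arbitrary seed, only so the sketch is repeatable
dt <- data.table(year = rep(2000:2010, 30), response = runif(330, 0, 5))

# One row per year, one column per quartile.
# q25/q50/q75 are illustrative names, not part of the original example.
a_wide <- dt[, as.list(setNames(
                 quantile(response, probs = c(0.25, 0.5, 0.75), na.rm = TRUE),
                 c("q25", "q50", "q75"))),
             by = year]
```

Returning a named list from j makes data.table spread the three quantiles into three columns instead of three rows.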
What I need to do now is to compare the value of dt$response with the quartile values in a and create a new variable dt$quartresponse that takes
- Value 0 if dt$response[i] is smaller than the 0.25 quartile value for that specific year
- Value 1 if dt$response[i] is between the 0.25 and 0.50 quartile values for that specific year
- Value 2 if dt$response[i] is between the 0.50 and 0.75 quartile values for that specific year
- Value 3 otherwise
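To make the target concrete, the rule above could perhaps be sketched with base R's findInterval applied within each year. This is one possible approach under assumptions, not necessarily the fastest one: the seed is arbitrary, and a value exactly equal to a quartile falls into the higher group.

```r
library(data.table)

set.seed(1)  # assumption: arbitrary seed, only so the sketch is repeatable
dt <- data.table(year = rep(2000:2010, 30), response = runif(330, 0, 5))

# For each year, count how many of that year's quartile values are <= response.
# findInterval() returns 0 below the 0.25 quartile, 1 or 2 in between,
# and 3 at or above the 0.75 quartile. NA responses stay NA.
dt[, quartresponse := findInterval(
       response,
       quantile(response, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)),
   by = year]
```

Because the quartiles are computed inside the grouped call, no separate lookup table or explicit loop over rows is needed in this sketch.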
I'm sure some kind of loop would work, but there must be a more R-like way of solving this.
Any suggestions are welcome!
Simon