I would like to obtain quantile in a tailored subset. For example in the following dataset:
data = data.table(x=c(rep(1,9),rep(2,9)),y=c(rep(1:6,each=3)),z=1:18)
- For each row i, I want to know, in the rows with x=x[i] and y=<y[i], the 50%tile (as well as other quantiles in further calculations, e.g. 10%tile, 5%tile) of z[i].
An expected output would be
c(2,2,2,3.5,3.5,3.5,5,5,5,11,11,11,12.5,12.5,12.5,14,14,14)
- For each row i, I want to know, in the rows with x=x[i] and y=<y[i], the mean of z[i].
An expected output would be (same as 1 in this dataset, but would be different in other datasets).
c(2,2,2,3.5,3.5,3.5,5,5,5,11,11,11,12.5,12.5,12.5,14,14,14)
I can write a function for it and use apply to loop it over each row. However, the dataset has more than 30,000,000 rows which would take days. Is there a quicker way to calculate it, in R data.table or tidyverse or other packages?