I have a dataset that looks like the following:
INCOME | WEALTH |
---|---|
10.000 | 100000 |
15.000 | 111000 |
14.200 | 123456 |
12.654 | 654321 |
I have many more rows.
I now want to find how much INCOME a household in a specific WEALTH percentile has. The following quantiles are relevant:
c(0.01,0.05,0.1,0.25,0.5,0.75,0.9,0.95,0.99)
I have always used the following code to get specific percentile values:
a <- quantile(WEALTH, probs = c(0.01,0.05,0.1,0.25,0.5,0.75,0.9,0.95,0.99))
But now I want to base my percentiles on WEALTH but get the respective INCOME. I have tried the following code but the results are not plausible:
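library(dplyr)
# Assign each household to one of 100 WEALTH buckets
df$percentile <- ntile(df$WEALTH, 100)
# Keep only the buckets of interest
df <- df[df$percentile %in% c(1, 5, 10, 25, 50, 75, 90, 95, 99), ]
# Largest INCOME within each remaining bucket
a <- df %>%
  group_by(percentile) %>%
  summarise(max = max(INCOME))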
The results that I get are not consistent with other parts of the analysis that I have done. I assume that the percentiles returned by the "quantile" function are calculated differently than by simply taking the maximum within each bucket.
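To make the suspected discrepancy concrete, here is a minimal sketch on made-up data (the `toy` data frame and the names `cut_quantile` and `cut_ntile` are hypothetical and not part of my real analysis). It only compares the WEALTH cutoffs that the two approaches produce, assuming the default `type = 7` of `quantile()`:

```r
library(dplyr)

# Made-up data for illustration only: 1000 households, evenly spaced WEALTH
toy <- data.frame(WEALTH = seq(1000, 1000000, by = 1000))
pct <- c(1, 5, 10, 25, 50, 75, 90, 95, 99)

# Cutoffs from quantile(): the default (type 7) interpolates between
# neighbouring order statistics
cut_quantile <- quantile(toy$WEALTH, probs = pct / 100)

# Cutoffs from ntile(): largest observed WEALTH inside each of 100 buckets
toy$percentile <- ntile(toy$WEALTH, 100)
cut_ntile <- tapply(toy$WEALTH, toy$percentile, max)[as.character(pct)]

# Side-by-side comparison of the two sets of cutoffs
comparison <- data.frame(
  percentile = pct,
  quantile   = unname(cut_quantile),
  ntile_max  = as.numeric(cut_ntile)
)
print(comparison)
```

On this toy data the two columns do not match: for the 1st percentile, `quantile()` interpolates to about 10990, while the bucket maximum is 10000. If I understand it correctly, the same kind of gap could carry over to INCOME, and `max(INCOME)` within a bucket is in any case the highest income in that bucket and not the income of the household sitting at the WEALTH cutoff.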