
I have a dataset of ~2000 observations and fitted a quantile regression at the 95th percentile using the quantreg package.

I wanted to identify the observations that were actually used for calculating the slope and intercept for the 95th percentile regression in order to perform further analysis. Is there any way to do that?

This is the code for quantreg I used so far:

library(quantreg)
datos <- quantreg.example
rq(y ~ x, tau = 0.95, data = datos, method = "br", model = TRUE)

and here is the data file: http://www.filedropper.com/quantregexample

lmo
Giuseppe Petri
  • I'm voting to close this question as off-topic because it is about how to use R without a reproducible example. – gung - Reinstate Monica May 06 '16 at 14:21
  • I linked the database and posted the code I have used so far. So I think there is a reproducible example, or am I wrong? – Giuseppe Petri May 06 '16 at 15:07
  • Is it correct to say that quantile regression only fits on a subset of the data? Doesn't it use the full dataset to calculate the slope and intercept, see: http://www.econ.uiuc.edu/~roger/research/rq/QRJEP.pdf . – bouncyball May 06 '16 at 15:32
  • It is true that it uses the whole dataset but, conceptually, some of those points belong to the 95th percentile regression that was calculated. Therefore, those observations could (somehow) be identified or extracted. I need to identify those points in order to test how those observations differ (in terms of other variables) from the rest of the population. I will check the article you sent to see if I can find any pointers. – Giuseppe Petri May 06 '16 at 16:22
  • `data[data$y >= quantile(data$y, probs = 0.95), ]$y` gives y values that are greater than or equal to the 95th quantile – bouncyball May 06 '16 at 16:23
  • Yes, that is for the distribution of y regardless of the value of x. What I am looking for are the observations at the 95th quantile regression. – Giuseppe Petri May 06 '16 at 17:00
  • So for each level of `x`, extract the 95th quantile of `y`? `by(data, x, FUN = function(x) quantile(x$y, probs = .95))` ? – bouncyball May 06 '16 at 18:18
  • Yes, @bouncyball. I think it could be a way to retrieve the observations that were actually used in the 95th quantile regression. – Giuseppe Petri May 06 '16 at 18:23
  • @bouncyball, thanks for all your effort. I tried it but it gave me the following error: object 'x' not found. I am fairly new to R, so maybe I am making a mistake. – Giuseppe Petri May 06 '16 at 18:35
  • @JoseLRotundo Maybe try: `by(datos, datos$x, FUN = function(d) quantile(d$y, probs = 0.95))` – bouncyball May 06 '16 at 19:04

1 Answer


Alright, I thought I would whip up a quick solution to what I think your question is asking for:

set.seed(123)
library(dplyr) #data transformation
library(quantreg) #quantile regression
# make dummy data
df <- data.frame(x = sample(1:10, 200, replace = TRUE))
df$y <- df$x + rnorm(200)
# fit quantile regression
my_q <- rq(y ~ x, data = df, tau = 0.95)
# use dplyr to get the 95% quantile of y at each value of x
df_q <- df %>% group_by(x) %>% summarise(yq = quantile(y, probs = 0.95))
# quick viz with red points being the 95% quantiles
with(df, plot(x, y))
legend('topleft', legend = '95% Conditional Quantiles', col = 'red', pch = 19, bty = 'n')
with(df_q, points(x, yq, col = 'red', pch = 19))
# add the fitted 95% quantile regression line
abline(reg = my_q)

Hope this helps.
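If what is wanted is the position of each observation relative to the fitted line itself, rather than the per-x empirical quantiles, one possible approach is to look at the residuals of the fit. This is only a sketch of one reading of the question: every observation enters the fitting criterion, but the fitted line typically passes through a small number of points (residual numerically zero), and only a small share of points lies above it. The dummy data mirrors the answer above, and the names `fit`, `tol`, and `position` are illustrative choices, not part of quantreg.

```r
library(quantreg)

set.seed(123)
# Dummy data in the same shape as the answer above (not the asker's file)
df <- data.frame(x = sample(1:10, 200, replace = TRUE))
df$y <- df$x + rnorm(200)

# Fit the 95th percentile regression and extract residuals (y minus fitted)
fit <- rq(y ~ x, tau = 0.95, data = df)
res <- resid(fit)

# Assumed tolerance for treating a residual as zero; adjust to the scale of y
tol <- 1e-8

on_line <- which(abs(res) < tol)  # rows the fitted line passes through
above   <- which(res > tol)       # rows strictly above the fitted line

# Label every row so the groups can be compared on other variables
df$position <- ifelse(abs(res) < tol, "on",
                      ifelse(res > 0, "above", "below"))
table(df$position)
```

Rows labelled "on" or "above" can then be compared with the "below" rows on other variables, for example with `subset(df, position != "below")`.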

bouncyball