0

I want to generate data to show how quantile regression can be useful, in particular when linear regression shows that there is no (mean) correlation between two variables, but there is a correlation in the upper and/or lower quantiles of data.

It is possible to find this behavior in this figure below:

changes in cutthroat trout density (y) as a function of the ratio of stream width to depth (X)

But I would like to recreate this behavior with data that is meaningful to my research field, which happens to be housing economics.

droubi
  • 93
  • 8

1 Answers1

0

I think I find a way of doing this. I don't know if it is the best way of doing it, but I did it: with the truncnorm package one can create truncated normal data, with which I could create this graph, which seems fine to me.

set.seed(5)
x = rtruncnorm(500, a = 125, b = 600, mean = 360, sd = 100)
b0 = 2500 # intercept chosen at your choice
b1 = 0 # coef chosen at your choice
h = function(x) 250 - .25*x # h performs heteroscedasticity function (here I used a linear one)
i = function(x) b0 + .25*x # lower bound limit
y = rtruncnorm(500, a = i(x), mean = b0, sd = h(x))
#y = b0 + b1*x + eps - h(x)
dados <- data.frame(VU = y, Area = x)
#X <- X[which(X[, "y"] > 2400), ]
fit <- lm(VU ~ Area, data = dados)
fit1 <- rq(VU ~ Area, tau = .10, data = dados)
fit2 <- rq(VU ~ Area, tau = .25, data = dados)
fit3 <- rq(VU ~ Area, tau = .50, data = dados)
fit4 <- rq(VU ~ Area, tau = .75, data = dados)
fit5 <- rq(VU ~ Area, tau = .90, data = dados)
fit6 <- rq(VU ~ Area, tau = .95, data = dados)
par(mar=c(5,5,2,6), bg = "light blue")
plot(VU ~ Area, data = dados, main = "A importância de analisar todos os quantis",
     xlab = expression("Área"~(m^2)),
     ylab = expression("VU (R$/"~m^2~")"))
abline(fit, lty = 2, lwd = 2)
abline(fit1, col = "yellow")
abline(fit2, col = "red")
abline(fit3, col = "green")
abline(fit4, col = "blue")
abline(fit5, col = "darkred")
abline(fit6, col = "darkblue")
legend('topleft', xpd = TRUE, inset=c(1,0), bty = "n",
       legend=c(".10", ".25", ".50", ".75", ".90", "0.95", "mean"),
       col = c("yellow", "red", "green", "blue", "darkred", "darkblue", "black"), 
       lty = c(rep(1, 6), 2),
       lwd = c(rep(1,6), 2),
       title= expression(~tau), text.font=4, bg='lightblue')
droubi
  • 93
  • 8