I'm trying to use the quantreg package to fit an exponential curve.
Here is a reproducible example. My real data are much more complex and contain outliers, which is why I prefer not to use nls, which is not robust to outliers.
library(quantreg)
library(ggplot2)
x = 1:100
set.seed(42)
y = 500 * exp(-0.02 * x) + rnorm(100, 0, 5)
df = data.frame(x, y)
plot(df)
formula = y ~ k * exp(b*x)
qr_exp = nlrq(formula,
              data = df,
              start = list(k = 600, b = -0.01),
              tau = .50,
              control = nlrq.control(maxiter = 1000))
summary(qr_exp)
sum(qr_exp$m$resid())
[1] -26.52373
I expected sum(qr_exp$m$resid()) to be around 0 since tau = 0.5, but the value is negative, which means the model tends to overestimate the observed values.
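To separate two possible notions of "balanced", here is a minimal sketch that prints both the sum of the residuals and the counts of negative and positive residuals. It uses base R only: `pinball()` and `fit_quantile()` are hypothetical helper names, and minimising the check loss with `optim()` is a stand-in I assume behaves similarly to `nlrq`, not its actual algorithm. As far as I understand, the optimality condition of the check loss involves the signs of the residuals rather than their magnitudes, so the two quantities need not agree.

```r
# Base R only. pinball() and fit_quantile() are hypothetical helpers for
# illustration; optim() is a stand-in for nlrq, not its algorithm.
set.seed(42)
x <- 1:100
y <- 500 * exp(-0.02 * x) + rnorm(100, 0, 5)

# Check (pinball) loss: tau * r for r >= 0, (tau - 1) * r for r < 0.
pinball <- function(r, tau) sum(r * (tau - (r < 0)))

# Minimise the check loss of y - k * exp(b * x) over (k, b).
fit_quantile <- function(tau, start = c(k = 600, b = -0.01)) {
  loss <- function(p) pinball(y - p[1] * exp(p[2] * x), tau)
  optim(start, loss, method = "Nelder-Mead",
        control = list(parscale = c(100, 0.01), maxit = 5000))
}

fit <- fit_quantile(0.5)
res <- y - fit$par[1] * exp(fit$par[2] * x)

sum(res)          # sum of residuals: depends on magnitudes
table(sign(res))  # counts of negative / positive residuals
```

The same two lines can be run on `qr_exp$m$resid()` to compare against the `nlrq` fit.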
As you can see below, the sum of the residuals is closer to 0 with tau = 0.47:
formula = y ~ k * exp(b*x)
qr_exp = nlrq(formula,
              data = df,
              start = list(k = 600, b = -0.01),
              tau = .47,
              control = nlrq.control(maxiter = 1000))
summary(qr_exp)
sum(qr_exp$m$resid())
[1] -4.467781
I don't really understand why.
Is it because there could be an infinite number of solutions, and so no guarantee of having as many negative residuals as positive ones?
If so, what is the best approach, given that it is very important for me to:
- minimise the least absolute deviation rather than the least squares deviation (which is not robust to outliers)
- have balanced residuals?
Would it make sense to add a small L2 component to the loss (as in the Huber loss) to get something more balanced?
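To make the Huber idea concrete, here is a minimal sketch of what I have in mind, again in base R only. `huber()` and `fit_huber()` are hypothetical helper names, `delta = 5` is an arbitrary threshold chosen for illustration, and `optim()` with Nelder-Mead is just a convenient generic optimiser, not a recommended method.

```r
# Base R only. huber() and fit_huber() are hypothetical helpers; delta is an
# arbitrary illustrative threshold, not a recommended value.
set.seed(42)
x <- 1:100
y <- 500 * exp(-0.02 * x) + rnorm(100, 0, 5)

# Huber loss: quadratic for |r| <= delta, linear beyond it.
huber <- function(r, delta) {
  a <- abs(r)
  sum(ifelse(a <= delta, 0.5 * r^2, delta * (a - 0.5 * delta)))
}

# Minimise the Huber loss of y - k * exp(b * x) over (k, b).
fit_huber <- function(delta, start = c(k = 600, b = -0.01)) {
  loss <- function(p) huber(y - p[1] * exp(p[2] * x), delta)
  optim(start, loss, method = "Nelder-Mead",
        control = list(parscale = c(100, 0.01), maxit = 5000))
}

fit_h <- fit_huber(delta = 5)
res_h <- y - fit_h$par[1] * exp(fit_h$par[2] * x)

sum(res_h)          # sum of residuals under the Huber fit
table(sign(res_h))  # counts of negative / positive residuals
```

A larger `delta` moves the fit toward least squares, and a smaller one toward least absolute deviation, so it would act as the knob for trading robustness against balance.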