
My goal is to estimate two parameters of a model (see CE_hat below). I fit the two parameters (w, a) to only 7 observations, so overfitting sometimes occurs. One idea would be to restrict the influence of each observation so that outliers do not "hijack" the parameter estimates. The method previously suggested to me was nlrob. The problem is that in extreme cases, such as the example below, it fails with "Missing value or an infinity produced when evaluating the model". To avoid this I used nlsLM, which does converge, but at the cost of returning outlandish estimates.
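A possible source of that error, offered as an assumption rather than a confirmed diagnosis: L is 0 for the first four observations, so L^a becomes Inf as soon as the optimizer tries a negative a, and the model then evaluates to NaN. A minimal base-R check, using the same CE_hat, H and L as the example below:

```r
# Same model and inputs as in the reproducible example
CE_hat <- function(w, H, a, L) (w * (H^a - L^a) + L^a)^(1/a)
H <- c(4, 8, 12, 16, 16, 16, 16)
L <- c(0, 0, 0, 0, 4, 8, 12)

# With a > 0 every prediction is finite
pred_pos <- CE_hat(w = 0.5, H = H, a = 1, L = L)
all(is.finite(pred_pos))

# With a < 0, 0^a is Inf and Inf - Inf is NaN wherever L == 0
pred_neg <- CE_hat(w = 0.5, H = H, a = -1, L = L)
is.nan(pred_neg)
```

If this is the cause, keeping a above 0 during the fit would avoid the error, although it would not by itself make the estimates more plausible.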

Any ideas as to how I can use robust fitting with this example?

I include a reproducible example below. The observables are CE, H and L. These three are fed into a function (CE_hat) in order to estimate "a" and "w". Values close to 1 for "a" and close to 0.5 for "w" are generally considered more reasonable. As you can hopefully see, when all observations are included, a is about 91 while w is next to 0. However, if we exclude the 4th (or 7th) observation (for CE, H and L), we get much more sensible estimates. Ideally, I would like to achieve the same result without excluding these observations. One idea would be to restrict their influence. I understand that it may not be clear why these observations constitute some sort of "outliers". It is hard to say more about that without going into a lot of detail, but I am happy to explain the model further should a question arise.

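library("minpack.lm")
options("scipen" = 50)  # avoid scientific notation in printed output

# Observed data
CE <- c(3.34375, 6.6875, 7.21875, 13.375, 14.03125, 14.6875, 12.03125)
H <- c(4, 8, 12, 16, 16, 16, 16)
L <- c(0, 0, 0, 0, 4, 8, 12)

# Model: predicted CE for parameters w and a
CE_hat <- function(w, H, a, L) {
  (w * (H^a - L^a) + L^a)^(1/a)
}

# Levenberg-Marquardt fit of (w, a), printing progress at every iteration
aw <- nlsLM(CE ~ CE_hat(w, H, a, L),
            start = list(w = 0.5, a = 1),
            control = nls.lm.control(nprint = 1, maxiter = 100))
summary(aw)$parameters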
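The idea of restricting each observation's influence could be sketched with a Huber-type loss minimized by base R's optim. This is only an illustration under stated assumptions: the names huber, huber_obj and fit_huber, the threshold delta = 1, and the penalty value for non-finite predictions are all hypothetical choices, not part of nlrob or nlsLM. Whether it yields more plausible values for these data has not been verified here.

```r
CE <- c(3.34375, 6.6875, 7.21875, 13.375, 14.03125, 14.6875, 12.03125)
H <- c(4, 8, 12, 16, 16, 16, 16)
L <- c(0, 0, 0, 0, 4, 8, 12)
CE_hat <- function(w, H, a, L) (w * (H^a - L^a) + L^a)^(1/a)

# Huber loss: quadratic for small residuals, linear beyond delta,
# so a large residual pulls on the fit less than under least squares
huber <- function(r, delta = 1) {
  ifelse(abs(r) <= delta, 0.5 * r^2, delta * (abs(r) - 0.5 * delta))
}

# Objective over p = c(w, a); a large penalty (assumed value) replaces
# non-finite results, e.g. when a < 0 and L == 0
huber_obj <- function(p) {
  val <- sum(huber(CE - CE_hat(p[1], H, p[2], L)))
  if (is.finite(val)) val else 1e10
}

# Nelder-Mead search from the same starting values as the nlsLM call
fit_huber <- optim(c(w = 0.5, a = 1), huber_obj)
fit_huber$par
```

The threshold delta controls how early a residual is treated as large; it would need tuning to the scale of CE.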
  • You could play with the `weights` argument to `nlsLM`. Running `summary(lm(CE ~ H + L))` shows that `H` is the only significant variable. Focusing on it, and noticing that its last 4 values are the same, give a quarter weight to each of them: `nlsLM(..., weights = c(1, 1, 1, .25, .25, .25, .25))`. Now the answer is closer to what you want. – G. Grothendieck Feb 06 '17 at 15:21
  • Thank you. This is quite useful! – Orestis Kopsacheilis Feb 07 '17 at 16:30
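The weighting suggested in the first comment can be written out in full as below. This sketch assumes the minpack.lm package is installed; the weights are the commenter's, fit_w is a hypothetical name, and the estimates are not shown because they have not been verified here.

```r
library(minpack.lm)

CE <- c(3.34375, 6.6875, 7.21875, 13.375, 14.03125, 14.6875, 12.03125)
H <- c(4, 8, 12, 16, 16, 16, 16)
L <- c(0, 0, 0, 0, 4, 8, 12)
CE_hat <- function(w, H, a, L) (w * (H^a - L^a) + L^a)^(1/a)

# The four observations sharing H = 16 each get a quarter weight,
# so together they count as much as one of the first three
fit_w <- nlsLM(CE ~ CE_hat(w, H, a, L),
               start = list(w = 0.5, a = 1),
               weights = c(1, 1, 1, .25, .25, .25, .25),
               control = nls.lm.control(maxiter = 100))
coef(fit_w)
```

Unlike a robust loss, these weights are fixed in advance from the design (repeated H values) rather than derived from the size of the residuals.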
