What I am trying to do is to make a non-linear regression using possible submodels of my full model and then choose the most apropriate model using AIC criterion. The problem is to generate all possible submodels and then apply them to nls
function to find the best fit.
Let's say I have a data:
x <- rnorm(100)
y <- 1+x+x^2-x^3-x^4+rnorm(100, sd=0.1)
And the full formula as function of the variable x
and some parameters a
, b
, c
, d
, e
:
full <- function(x, a, b, c, d, e){
return(a + b*x + c*x^2 + d*x^3 + e*x^4)
}
(I know it is a silly example of non-linear model and I could use data transformation + linear model for it, but I want it to be simple)
I want to generate all possible submodels, skipping some parameters. I tried to just set those skipped parameters to zero:
skip <- function(args){
# args = subset of c("a", "b", "c", "d", "e")
return (function(x, a=0, b=0, c=0, d=0, e=0) {
par <- c("a", "b", "c", "d", "e")
parameters <- lapply(par, function(p){
if(p %in% args){
return (0)
}
else{
return (get(p))
}
})
names(parameters) <- c("a", "b", "c", "d", "e")
return (with(parameters, a + b*x + c*x^2 + d*x^3 + e*x^4))
})
}
And I write a function to apply those formulas to nls
:
apply_nls <- function(func, start){
fit <- nls(y~func(x, a, b, c, d, e),
start=start)
return(fit)
}
The problem is that it does not work. If I do specify the starting value for ommited parameters:
apply_nls(skip("e"), start=list(a=1, b=1, c=1, d=-1, e=-1))
then I got an error message
singular gradient matrix at initial parameter estimates
(because indeed, my function do not depend on e
parameter).
But when I do not specify starting values for b
and d
(I should be able to do it, because I specified the default vales of those parameters inside skip
):
apply_nls(skip("e"), start=list(a=1, b=1, c=1, d=-1))
Then I got another error message:
parameters without starting value in 'data': e
I suppose I should restrict the parameters in skip
and/or in apply_nls
functions so they take only the parameters I need at that time, like:
apply_nls <- function(func, args, start){
fit <- nls(y~func(x, args),
start=start)
return(fit)
}
But it does not work and I do not know how to properly implement it.