1

What I am trying to do is to make a non-linear regression using possible submodels of my full model and then choose the most apropriate model using AIC criterion. The problem is to generate all possible submodels and then apply them to nls function to find the best fit.

Let's say I have a data:

x <- rnorm(100)
y <- 1+x+x^2-x^3-x^4+rnorm(100, sd=0.1)

And the full formula as function of the variable x and some parameters a, b, c, d, e:

full <- function(x, a, b, c, d, e){
  return(a + b*x + c*x^2 + d*x^3 + e*x^4)
}

(I know it is a silly example of non-linear model and I could use data transformation + linear model for it, but I want it to be simple)

I want to generate all possible submodels, skipping some parameters. I tried to just set those skipped parameters to zero:

skip <- function(args){
  # args = subset of c("a", "b", "c", "d", "e")
  return (function(x, a=0, b=0, c=0, d=0, e=0) {
    par <- c("a", "b", "c", "d", "e")
    parameters <- lapply(par, function(p){
      if(p %in% args){
        return (0)
      }
      else{
        return (get(p))
      } 
    })
    names(parameters) <- c("a", "b", "c", "d", "e")
    return (with(parameters, a + b*x + c*x^2 + d*x^3 + e*x^4))
  })
}

And I write a function to apply those formulas to nls:

apply_nls <- function(func, start){
  fit <- nls(y~func(x, a, b, c, d, e),
             start=start)
  return(fit)
}

The problem is that it does not work. If I do specify the starting value for ommited parameters:

apply_nls(skip("e"), start=list(a=1, b=1, c=1, d=-1, e=-1))

then I got an error message

singular gradient matrix at initial parameter estimates

(because indeed, my function do not depend on e parameter).

But when I do not specify starting values for b and d (I should be able to do it, because I specified the default vales of those parameters inside skip):

apply_nls(skip("e"), start=list(a=1, b=1, c=1, d=-1))

Then I got another error message:

parameters without starting value in 'data': e

I suppose I should restrict the parameters in skip and/or in apply_nls functions so they take only the parameters I need at that time, like:

apply_nls <- function(func, args, start){
  fit <- nls(y~func(x, args),
             start=start)
  return(fit)
}

But it does not work and I do not know how to properly implement it.

Marta Cz-C
  • 759
  • 1
  • 8
  • 18
  • Your example model is fully linear in its parameters. Please provide an example of a model that is actually non-linear. – Roland Aug 30 '15 at 11:21
  • We need a representative example. It's not obvious how you would define submodels of a function that is actually non-linear. – Roland Aug 30 '15 at 11:25
  • My real model is actually linear to most variables and exponential to one: `a + b*exp(c*x1) + d*x2 + ....` and I take submodels containing this nonlinear part `b*exp(c*x1)` and simplifying the linear terms. – Marta Cz-C Aug 30 '15 at 11:32

1 Answers1

1

If anyone is interested, I've solved this problem. The function apply_nls works when it is in the form:

apply_nls <- function(func, par, start){
  fit <- nls(y~do.call(func, args=append(list(x=x), mget(par))),
             start=start)
  return(fit)
}

Here mget returns value of each parameter given the parameter name (as a string) and do.call enable to feed the func with resulting arguments. This func is a function (sub-formula) after skipping some parameters, par are the remaining parameters and start are starting values for those parameters. So application of apply_nls looks like this:

apply_nls(skip("e"), par=c("a", "b", "c", "d"), start=list(a=1, b=1, c=1, d=-1))

To get all fits for submodels:

1) I assign parameters names and start values for all of them

parameters <- c("a", "b", "c", "d", "e")
start <- list(a=1, b=1, c=1, d=-1, e=-1)

2) I make a list of all combination of dropped parameters

drops <- append(NA, c(parameters,
           combn(parameters, 2, simplify=F),
           combn(parameters, 3, simplify=F),
           combn(parameters, 4, simplify=F)))

3) I write two functions that would return remaining parameters or start values given the parameters to drop:

choose_starts <- function(args, start){
  return(start[!(names(start) %in% args)])
}

choose_pars <- function(args, all_pars){
  return(all_pars[!all_pars %in% args])
}

4) I create all combinations of formulas, parameters and starting values given the skipped parameters:

all_formulas <- lapply(drops, skip)

all_starts <- lapply(drops, choose_starts, start)

all_pars <- lapply(drops, choose_pars, parameters)

5) I fit the nonlinear models for all the above.

all_fits <- mapply(apply_nls, all_formulas, all_pars, all_starts, SIMPLIFY=F)
Marta Cz-C
  • 759
  • 1
  • 8
  • 18