tl;dr: this is OK, except for some very special cases
Background
The .Environment
attribute in R contains a reference to the context in which an R closure (usually a formula or a function) was defined. An R environment is a store holding values of variables, similarly to a list. This allows the formula to refer to these variables, for example:
> f = function(g) return(y ~ g(x))
> form = f(exp)
> lm(form, list(y=1:10, x=log(1:10)))
...
Coefficients:
(Intercept) g(x)
3.37e-15 1.00e+00
In this example, the formula form
if defined as y~exp(x)
, by giving g
the value of exp
. In order to be able to find the value of g
(which is an argument to function f
), the formula needs to hold a reference to the environment constructed inside the call to function f
.
You can see the enviroment attached to a formula by using the attributes()
or environment()
functions as follows:
> attributes(form)
$class
[1] "formula"
$.Environment
<environment: R_GlobalEnv>
> environment(form)
<environment: R_GlobalEnv>
Your question
I believe you are using the nnet()
function variant with a formula (rather than matrices), i.e.
> nnet(y ~ x1 + x2, ...)
Unfortunately, R keeps the entire environment (including all the variables defined where your formula is defined) allocated, even if your formula does not refer to any of it. There is no way to the language to easily tell what you may or may not be using from the environment.
One solution is to explicitly retain only the required parts of the environment. In particular, if your formula does not refer to anything in the environment (which is the most common case), it is safe to remove it.
I would suggest removing the environment from your formula before you call nnet
, something like this:
form = y~x + z
environment(form) = NULL
...
result = nnet(form, ...)