I'm building a function for many model types which needs to extract the formula used to make the model. Is there a flexible way to do this? For example:

x <- rnorm(10)
y <- rnorm(10)
z <- rnorm(10)
equation <- z ~ x + y
model <- lm(equation)

What I need to do is extract the formula object "equation" when all I have been passed is the model.

Matt Ball
mike

2 Answers

You can get the call that created the model with:

model$call
# lm(formula = formula)

To see how I found that, inspect the structure of the object:

str(model)

Since you passed an object named 'formula' (a bad choice of name, by the way) from the calling environment, the call holds only that name, so you may need to evaluate it to recover the formula itself:

eval(model$call[[2]])
# z ~ x + y

@JPMac offered a more compact method: formula(model). It's also worth looking at the mechanism used by the formula.lm function. The function named formula is generic, and methods(formula) lists the S3 methods that have been defined for it. Because formula.lm is marked with an asterisk in that listing, you need getAnywhere to view it:

> getAnywhere(formula.lm)
A single object matching ‘formula.lm’ was found
It was found in the following places
  registered S3 method for formula from namespace stats
  namespace:stats
with value

function (x, ...) 
{
    form <- x$formula
    if (!is.null(form)) {
        form <- formula(x$terms)
        environment(form) <- environment(x$formula)
        form
    }
    else formula(x$terms)
}
<bytecode: 0x36ff26158>
<environment: namespace:stats>

So it uses "$" to extract the list item named "formula" rather than pulling it from the call. If the $formula item is missing (which it is in your case), it returns formula(x$terms) instead. I suspect that calls formula.default, which, judging from its source, appears only to adjust the environment of the object.
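As a minimal sketch of the behaviour described above, the following rebuilds the question's model and checks which branch of formula.lm applies. The seed is only an assumption added for reproducibility; everything else uses the names from the question.

```r
# Rebuild the example from the question (seed added only for reproducibility)
set.seed(1)
x <- rnorm(10)
y <- rnorm(10)
z <- rnorm(10)
equation <- z ~ x + y
model <- lm(equation)

# List the S3 methods registered for the generic formula()
methods(formula)

# An lm fit has no component named "formula", so formula.lm takes its
# else branch and returns formula(x$terms)
is.null(model$formula)

# The generic recovers the formula itself, not the name it was stored under
f <- formula(model)
class(f)
deparse(f)

# The environment attached to the recovered formula
environment(f)
```

Here deparse(f) gives the text "z ~ x + y" even though the call only recorded the name of the variable holding the formula.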

IRTFM
  • Thanks for responding -- how do I actually access the formula object "formula". If I try info <- model$call, info$formula just gives me a symbol object "formula". – mike Mar 14 '12 at 00:34
  • I would suggest `as.list(model$call)$formula` rather than `model$call[[2]]`. – flodel Mar 14 '12 at 00:49
  • @mike: I'm not sure what you are asking. You are the one that decided to call your formula, "formula", and then passed that as a named argument to `lm`. `fortunes::fortune("dog")` definitely applies here. – IRTFM Mar 14 '12 at 01:37
  • I'm late to the game...but I think you can do this: `formula(model)` to get the formula object that was passed to model. – ldecicco Apr 23 '13 at 18:01
  • That does look more direct indeed. – IRTFM Apr 23 '13 at 18:49
  • I recommend using `formula(model)` as suggested above. If one fits models using a function, then the call will not contain the actual formula used and one will get an error if one tries to use the formula found in the `$call` element. – CoderGuy123 May 02 '16 at 00:46
  • `formula(model)` is the CORRECT answer. All the other answers are incorrect. – Ben P.P. Tung Oct 23 '16 at 08:52
  • And interestingly you can pass ``lm(formula(model), data = data)`` directly instead of ``lm(formula = formula(model), data = data)``, though I'm not sure I'd recommend it. (I expected it to fail, as ``formula`` is precisely the name of the argument of ``lm()``) – PatrickT Apr 30 '18 at 17:18

As noted, model$call will get you the call that created the lm object, but if the formula in that call is itself the name of an object, you get the object's name, not the formula.

The evaluated object, i.e. the formula itself, can be accessed through model$terms (along with a bunch of auxiliary information on how it was treated). This should work regardless of the details of the call to lm.
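A short sketch of the case where the call is not enough: the model is fitted inside a wrapper, so the call records only a local argument name. The wrapper fit_wrapper, its argument frm, and the data frame d are hypothetical names made up for this illustration.

```r
set.seed(42)
d <- data.frame(x = rnorm(10), y = rnorm(10), z = rnorm(10))

# Hypothetical wrapper: inside it the formula is known only as `frm`
fit_wrapper <- function(frm, data) lm(frm, data = data)

model <- fit_wrapper(z ~ x + y, d)

# The call stores the argument name, not the formula
model$call$formula

# Evaluating that name outside the wrapper fails, because `frm` only
# existed inside fit_wrapper
res <- try(eval(model$call$formula), silent = TRUE)
inherits(res, "try-error")

# The terms component still carries the evaluated formula
formula(model$terms)

# formula(model) reaches the same information through formula.lm
formula(model)
```

Both of the last two calls return z ~ x + y, which is why going through the terms (directly or via formula(model)) is the safer choice in a function meant to handle models it did not fit itself.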

Hong Ooi