I have a high-dimensional data frame `df` with dimensions of 3000 x 80 (a document-term matrix). I have a classification function that takes in two arguments: `formula` and `data`. For `formula`, I want it to take all the features (variables) of `df` automatically. Is there a way to take in a list of all column names to create a formula object?

Ben Bolker

juanjedi
- Formulas can use a `.` to refer to all variables: `~ .`. See [this question](https://stackoverflow.com/q/13446256/4996248) – John Coleman Sep 30 '20 at 23:29
- True, but wildcards can only be used with functions that implement them, such as `lm` and others, right? What if my function doesn't support this? – juanjedi Sep 30 '20 at 23:31
2 Answers
You could probably do `reformulate(names(df))`, which will produce a one-sided formula with all of the variable names. (It's really not much more than syntactic sugar for `as.formula(paste("~", paste(names(df), collapse = "+")))`.)
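If the classification function expects a two-sided formula, `reformulate()` also accepts a `response` argument. The following is only a minimal sketch: the toy data frame and the column name `label` are assumptions for illustration, standing in for the real document-term matrix and its class column.

```r
# Hypothetical toy data standing in for the 3000 x 80 document-term matrix.
# The column names, including `label`, are made up for this example.
df <- data.frame(
  label = c("a", "b", "a"),
  term1 = c(1, 0, 2),
  term2 = c(0, 3, 1),
  term3 = c(4, 1, 0)
)

# One-sided formula that uses every column of df as a term.
f_all <- reformulate(names(df))

# Two-sided formula: drop the response from the term list and
# pass it separately so it ends up on the left-hand side.
predictors <- setdiff(names(df), "label")
f_resp <- reformulate(predictors, response = "label")

print(f_all)   # ~label + term1 + term2 + term3
print(f_resp)  # label ~ term1 + term2 + term3
```

The resulting object can then be passed as the `formula` argument alongside `data = df`, provided the function accepts a standard formula object.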

Ben Bolker