Questions tagged [r-formula]

In the R language, formula objects store symbolic representations of variables. They are created with the tilde (`~`) operator and are most often used to specify statistical models. Use with the [r] tag.
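
As a minimal illustration of the description above, a formula can be created and inspected without any data being present. The names `y`, `x1` and `x2` are placeholders chosen for this sketch, not taken from any dataset:

```r
# A formula is an unevaluated, symbolic object created with `~`.
f <- y ~ x1 + x2

class(f)      # "formula"
all.vars(f)   # "y"  "x1" "x2"
f[[2]]        # left-hand side: the symbol y
f[[3]]        # right-hand side: the call x1 + x2
```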

23 questions
236
votes
3 answers

Use of ~ (tilde) in R programming Language

I saw in a tutorial about regression modeling the following command: myFormula <- Species ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width What exactly does this command do, and what is the role of ~ (tilde) in the command?
Ankita
92
votes
5 answers

Formula with dynamic number of variables

Suppose there is a data.frame foo_data_frame and one wants to regress the target column Y on some other columns. For that purpose, a formula and a model are usually used. For example: linear_model <- lm(Y ~ FACTOR_NAME_1 +…
Max
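
For the question above, one possible approach (a sketch, not necessarily what the answers recommend) is to build the formula from a character vector of column names with `stats::reformulate()`. The data frame contents and the column names `A` and `B` are made up for illustration:

```r
# Hypothetical data frame; only the column names matter here.
foo_data_frame <- data.frame(Y = rnorm(10), A = rnorm(10), B = rnorm(10))

# Use every column except the response as a predictor.
predictors <- setdiff(names(foo_data_frame), "Y")
f <- reformulate(predictors, response = "Y")   # Y ~ A + B

linear_model <- lm(f, data = foo_data_frame)
```

Because the predictor names are ordinary strings, the same code works for any number of columns.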
67
votes
10 answers

How to convert R formula to text?

I have trouble working with a formula as text. What I'm trying to do is to concatenate the formula to the title of the graph. However, when I try to work with the formula as text, I fail: model <- lm(celkem ~ rok + mesic) formula(model) #…
Tomas
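
One common way to get a formula as text is `deparse()`. The sketch below assumes a formula shaped like the one in the excerpt; the `"Model:"` prefix is only an example title:

```r
f <- celkem ~ rok + mesic   # same shape as the formula in the excerpt

# deparse() may return several strings for a long formula, so collapse them.
f_text <- paste(deparse(f), collapse = " ")
title_text <- paste("Model:", f_text)
```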
49
votes
3 answers

What does the R formula y~1 mean?

I was reading the documentation on R Formula, and trying to figure out how to work with depmix (from the depmixS4 package). Now, in the documentation of depmixS4, sample formulas tend to be something like y ~ 1. For a simple case like y ~ x, it is…
Antony
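
In standard model formulas, `1` on the right-hand side stands for the intercept, so `y ~ 1` describes a model with no predictors. The sketch below uses `lm()` and made-up data rather than depmixS4, whose handling of formulas is not shown here:

```r
set.seed(1)
y <- rnorm(20)     # made-up data

fit <- lm(y ~ 1)   # intercept only, no predictors

# The single fitted coefficient is the sample mean of y.
all.equal(unname(coef(fit)), mean(y))
```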
34
votes
2 answers

short formula call for many variables when building a model

I am trying to build a regression model with lm(...). My dataset has lots of features (>50). I do not want to write my code as: lm(output ~ feature1 + feature2 + feature3 + ... + feature70) I was wondering what is the shorthand notation to write…
iinception
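
A sketch of the usual shorthand: when a `data` argument is given, `.` on the right-hand side means "all other columns". The built-in `mtcars` data stands in for the asker's dataset here:

```r
# mpg is the response; every other column of mtcars becomes a predictor.
fit <- lm(mpg ~ ., data = mtcars)

# Terms can be excluded again with `-`.
fit_no_wt <- lm(mpg ~ . - wt, data = mtcars)
```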
30
votes
3 answers

Error in terms.formula(formula) : '.' in formula and no 'data' argument

I'm trying to use neuralnet for prediction. Create some X: x <- cbind(seq(1, 50, 1), seq(51, 100, 1)) Create Y: y <- x[,1]*x[,2] Give them names: colnames(x) <- c('x1', 'x2') names(y) <- 'y' Make a data.frame: dt <- data.frame(x, y) And now, I got…
luckyi
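
The error in the title arises when `.` has to be expanded but no data is available to expand it against. For a function that does not expand `.` itself, one possible workaround (a sketch under that assumption, using data shaped like the excerpt's) is to expand the formula explicitly first:

```r
x <- cbind(x1 = 1:50, x2 = 51:100)
dt <- data.frame(x, y = x[, "x1"] * x[, "x2"])

# With `data` supplied, terms() can resolve `.` to the remaining columns.
tt <- terms(y ~ ., data = dt)

# Rebuild an explicit formula from the expanded term labels.
explicit <- reformulate(attr(tt, "term.labels"), response = "y")   # y ~ x1 + x2
```

The explicit formula can then be passed on in place of `y ~ .`.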
13
votes
2 answers

extract variables in formula from a data frame

I have a formula that contains some terms and a data frame (the output of an earlier model.frame() call) that contains all of those terms and some more. I want the subset of the model frame that contains only the variables that appear in the…
Ben Bolker
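
For the simple case where the formula contains only plain variable names, `all.vars()` can drive the subsetting. This is a sketch with `mtcars` as a stand-in; it would not work unchanged for terms such as `log(x)`, because the model frame column is then named `log(x)` while `all.vars()` returns `x`:

```r
mf <- model.frame(mpg ~ wt + hp + cyl, data = mtcars)   # stand-in model frame
f  <- mpg ~ wt + hp                                     # formula using a subset

# Keep only the columns whose names appear in the formula.
sub <- mf[, all.vars(f), drop = FALSE]
```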
11
votes
1 answer

Formulas in user-defined functions in R

Formulas are a very useful feature of R's statistical and graphical functions. Like everyone, I am a user of these functions. However, I have never written a function that takes a formula object as an argument. I was wondering if someone could help…
gappy
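
A minimal sketch of a user-defined function that accepts a formula: `model.frame()` evaluates the formula against the data, and `model.response()` pulls out the left-hand side. The helper name `response_mean` is hypothetical and exists only for this example:

```r
# Hypothetical helper: mean of the response variable named in `formula`.
response_mean <- function(formula, data) {
  mf <- model.frame(formula, data = data)   # evaluate the formula's variables
  mean(model.response(mf))                  # left-hand side of the formula
}

response_mean(mpg ~ wt, data = mtcars)
```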
10
votes
2 answers

Condition ( | ) in R formula

I found this pdf on R formulas and I am not able to figure out how the | works (see the table on the second page). Furthermore, I could not find any explanation on the web. It appears from time to time in lists for possible formula symbols but…
Alex
8
votes
5 answers

How can I replace one term in an R formula with two?

I have something along the lines of y ~ x + z And I would like to transform it to y ~ x_part1 + x_part2 + z More generally, I would like to have a function that takes a formula and returns that formula with all terms that match "^x$" replaced by…
rcorty
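
When the term to replace is known in advance, `update()` is one option. This sketch does not cover the regex-matching version the question asks for, and `update()` may reorder the terms on the right-hand side:

```r
f <- y ~ x + z

# Drop x and add the two replacement terms; `.` means "what was there before".
g <- update(f, . ~ . - x + x_part1 + x_part2)

all.vars(g)   # contains y, z, x_part1 and x_part2, but no longer x
```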
6
votes
1 answer

Use of Tilde (~) and period (.) in R

I'm going over looping with tidyverse and purrr using Hadley's R4DS book and am a little confused as to the exact usage of the tilde ~ symbol and period symbol. So when writing for loops, or using map(), instead of writing out function(), it appears…
Kevin Lee
3
votes
1 answer

What does the ( | ) syntax mean in an R formula?

I am following a tutorial and came across the following syntax: # assume 'S' is the name of the subjects column # assume 'X1' is the name of the first factor column # assume 'X2' is the name of the second factor column # assume 'X3' is the name of…
Null Salad
1
vote
0 answers

What does ~ 1 mean in R function

Although I looked up the R docs for survfit{survival}, I couldn't see any information on this syntax ~ 1 in the formula survfit(Surv(time, status) ~ 1, data = lung). Could someone please help to explain what ~ 1 generally means in R?
Nemo
1
vote
1 answer

Create formula using the name of a data frame column

Given a data.frame, I would like to (dynamically) create a formula y ~ ., where y is the name of the first column of the data.frame. What complicates this beyond the approach of as.formula(paste(names(df)[1], "~ .")) is that the name of the column…
Mark
1
vote
2 answers

R substitute(), to substitute values in expression, is adding unnecessary quotes

I am trying to update a formula for a linear model in R, based on names of variables that I have stored in an array. I am using substitute() for that and the code is as follows. var = 'a' covar = c('b', 'c') covar = paste(c(var, covar), collapse = '…
Sapiens