Convert character to variable names used in formula in R

Question

I am a beginner in R and have run into a problem that is probably simple for you. Thanks in advance for any help. I am not sure the title reflects the problem I want to ask about, so I will use a simple example to make it clear.

Let's say we have a data frame containing two factors (FE and DI) and three variables (SR1, SR2 and SR3):

df<-data.frame(FE=rep(c("FL","FM","FH"),4),DI=rep(c("DL","DH"),each=6),
SR1=rpois(12,10),SR2=rpois(12,15),SR3=rpois(12,20))

I know how to calculate the means of the variables for each factor level using aggregate, for example:

df.me1<-aggregate(SR1~FE,df,mean)
df.me2<-aggregate(cbind(SR1,SR2,SR3)~FE+DI,df,mean)

Then I create two character vectors (vars and facs) holding the names of the three variables and the two factors:

vars<-c("SR1","SR2","SR3")
facs<-c("FE","DI")

Now, for my own reasons, I want to write the same calculations using the elements of these character vectors in the formula:

df.me1<-aggregate(vars[1]~facs[1],df,mean)
df.me2<-aggregate(cbind(vars[1],vars[2],vars[3])~facs[1]+facs[2],df,mean)

This code does not work, of course. What should I do to make it work this way?

jdobres · Answer 1 · 2016-10-18T22:19:21.450

There are two ways to do this. One would be through aggregate's formula interface, which is what you're currently trying to do. In order to make this work, you'd have to create a string that includes your dependent and independent variables. Then you'd convert that string to a formula object using as.formula(). This is overcomplicated, since it requires a lot of witchcraft with sprintf and/or paste.
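As a rough sketch of the formula route described above: the formula can be assembled from the character vectors with base R's reformulate() or with as.formula(paste0(...)). The seed and the names f1 and f2 are assumptions added for illustration and are not part of the original question.

```r
set.seed(1)  # arbitrary seed, only so the random example data is reproducible
df <- data.frame(FE = rep(c("FL", "FM", "FH"), 4),
                 DI = rep(c("DL", "DH"), each = 6),
                 SR1 = rpois(12, 10), SR2 = rpois(12, 15), SR3 = rpois(12, 20))
vars <- c("SR1", "SR2", "SR3")
facs <- c("FE", "DI")

# One response: reformulate() builds the formula SR1 ~ FE from strings
f1 <- reformulate(facs[1], response = vars[1])
df.me1 <- aggregate(f1, df, mean)

# Several responses: paste the cbind(...) left-hand side, then convert
f2 <- as.formula(paste0("cbind(", paste(vars, collapse = ", "), ") ~ ",
                        paste(facs, collapse = " + ")))
df.me2 <- aggregate(f2, df, mean)
```

Because the pieces are joined with collapse, the same two lines keep working if vars or facs grow or shrink.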

A simpler way to do this would be through aggregate's by argument, which is a little more friendly for substitutions made through variable names.

df.me1 <- aggregate(df[vars[1]], by = df[facs[1]], FUN = mean)

  FE   SR1
1 FH 10.00
2 FL 10.00
3 FM  9.25

df.me2 <- aggregate(df[vars], by = df[facs], FUN = mean)

  FE DI  SR1  SR2  SR3
1 FH DH  9.0 11.5 22.5
2 FL DH  8.0 16.5 21.5
3 FM DH 10.0 14.5 21.0
4 FH DL 11.0 16.5 18.0
5 FL DL 12.0 18.0 15.0
6 FM DL  8.5 13.0 24.0

Your solution works for the "aggregate" function example here, however, it doesn't solve my problem how to convert character to variable names used in a formula. Therefore, I prefer the answers using "eval(parse(text = "A string to execute"))" or "get()" which are more general solutions for my problem. Thank you anyway. — Myosotis, Oct 19 '16 at 13:21

Cyrillm_44 · Accepted Answer · 2016-10-19T01:50:33.107

For a more generic way of dealing with strings in expressions, I like using eval(parse(text = "A string to execute")). For example, with your code:

eval(parse(text = paste("df.me1<-aggregate(",vars[1],"~",facs[1],",df,mean)",sep="")))

which gives the following result (the data are random, so your numbers will differ):

> df.me1
  FE   SR1
1 FH  9.75
2 FL 10.75
3 FM 10.25

I also find this useful for retrieving an element of a list that is referenced by a string.

Here is what the paste command produces:

> paste("df.me1<-aggregate(",vars[1],"~",facs[1],",df,mean)",sep="")
[1] "df.me1<-aggregate(SR1~FE,df,mean)"

For the second part

eval(parse(text = paste("df.me2<-aggregate(cbind(",vars[1],",",vars[2],",",vars[3],")~",facs[1],"+",facs[2],",df,mean)",sep="")))
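The same eval(parse(text = ...)) idea can be written without spelling out each element, by joining the vectors with paste(..., collapse = ...). This is only a sketch: the seed and the name cmd are assumptions added for illustration, and the assignment is done outside the string here rather than inside it.

```r
set.seed(1)  # arbitrary seed, only so the random example data is reproducible
df <- data.frame(FE = rep(c("FL", "FM", "FH"), 4),
                 DI = rep(c("DL", "DH"), each = 6),
                 SR1 = rpois(12, 10), SR2 = rpois(12, 15), SR3 = rpois(12, 20))
vars <- c("SR1", "SR2", "SR3")
facs <- c("FE", "DI")

# Build the call as a string; collapse handles any number of names
cmd <- paste0("aggregate(cbind(", paste(vars, collapse = ","), ")~",
              paste(facs, collapse = "+"), ",df,mean)")
print(cmd)  # inspect the string before running it

# Parse the string into an expression and evaluate it
df.me2 <- eval(parse(text = cmd))
```

Printing the string first makes it easy to spot a missing comma or parenthesis before the code is evaluated.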

Great, I like your answer best. I can see that "eval(parse(text = "A string to execute"))" is a more general solution and really solves my problem. Thank you so much! — Myosotis, Oct 19 '16 at 13:16

score 1 · Answer 3 · edited May 23 '17 at 11:53

@jdobres' answer is cleaner and probably better in most instances, but if you must do this exactly as you've written it, then referencing this answer, you can just use get().

df.me2<-aggregate(cbind(SR1,SR2,SR3)~FE+DI,df,mean)
df.me2.get<-aggregate(cbind(get(vars[1]),get(vars[2]),get(vars[3]))~get(facs[1])+get(facs[2]),df,mean)

And checking if they are the same:

df.me2 == df.me2.get

       FE   DI  SR1  SR2  SR3
[1,] TRUE TRUE TRUE TRUE TRUE
[2,] TRUE TRUE TRUE TRUE TRUE
[3,] TRUE TRUE TRUE TRUE TRUE
[4,] TRUE TRUE TRUE TRUE TRUE
[5,] TRUE TRUE TRUE TRUE TRUE
[6,] TRUE TRUE TRUE TRUE TRUE
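Note that == compares the values only. The columns of the get() result may end up labelled with the get(...) expressions rather than FE, DI, SR1 and so on, so it is worth inspecting the names and, if needed, restoring them. A minimal sketch, where the seed and the name df.me2.named are assumptions added for illustration:

```r
set.seed(1)  # arbitrary seed, only so the random example data is reproducible
df <- data.frame(FE = rep(c("FL", "FM", "FH"), 4),
                 DI = rep(c("DL", "DH"), each = 6),
                 SR1 = rpois(12, 10), SR2 = rpois(12, 15), SR3 = rpois(12, 20))
vars <- c("SR1", "SR2", "SR3")
facs <- c("FE", "DI")

df.me2 <- aggregate(cbind(SR1, SR2, SR3) ~ FE + DI, df, mean)
df.me2.get <- aggregate(cbind(get(vars[1]), get(vars[2]), get(vars[3])) ~
                          get(facs[1]) + get(facs[2]), df, mean)

# Inspect the column labels produced by the get() version
print(names(df.me2.get))

# Restore the intended labels: grouping factors first, then the variables
df.me2.named <- setNames(df.me2.get, c(facs, vars))

# Compare names and values together
print(all.equal(df.me2, df.me2.named))
```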

Cool. That is the answer I am looking for, thanks so much! — Myosotis, Oct 18 '16 at 22:46