Suppose I have these variables:

data <- data.frame(x=rnorm(10), y=rnorm(10))
form <- 'z = x*y'

How can I compute z (using data's variables) and add it to data as a new variable?

I tried parse() and eval() (based on an old question), but without success :/

Rcoster
  • `data$z <- data$x * data$y`...? I guess that is not what you want? Can you elaborate on what you are trying to achieve? – Mark Heckmann Jan 07 '14 at 17:32
  • @MarkHeckmann: I guess `form` is dynamically read from somewhere and not known in advance. – nico Jan 07 '14 at 17:35
  • Does `form` have to come like this, as a character string representing some expression to be evaluated? – Gavin Simpson Jan 07 '14 at 17:49
  • Yes. `form` is a parameter of a function that creates new variables to split and/or subset the data before running the analysis. In this example, I would have another parameter like 'z > 2' to analyze only the cases where z is greater than 2. – Rcoster Jan 07 '14 at 18:03

2 Answers

Assuming what @nico said is correct, you might do:

d1 <- within(data, eval(parse(text=form)) )
d1
            x           y           z
1   0.5939462  1.58683345  0.94249368
2   0.3329504  0.55848643  0.18594826
3   1.0630998 -1.27659221 -1.35714497
4  -0.3041839 -0.57326541  0.17437812
5   0.3700188 -1.22461261 -0.45312970
6   0.2670988 -0.47340064 -0.12644474
7  -0.5425200 -0.62036668  0.33656135
8   1.2078678  0.04211587  0.05087041
9   1.1604026 -0.91092165 -1.05703586
10  0.7002136  0.15802877  0.11065390
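
Since the question's comments mention a second string parameter such as 'z > 2', the same idea can be wrapped in a function. This is only a minimal sketch: `derive_and_filter` and its `cond` argument are hypothetical names, not something from the question, and it assumes `data` has no column called `form`. Note that eval(parse()) runs whatever code the strings contain, so it should only be given trusted input.

```r
# Hypothetical helper: `form` is an assignment string such as 'z = x*y',
# `cond` an optional logical condition string such as 'z > 2'.
derive_and_filter <- function(data, form, cond = NULL) {
  # Evaluate the assignment with the columns of `data` in scope;
  # within() copies any newly created variable back into the data frame.
  out <- within(data, eval(parse(text = form)))
  if (!is.null(cond)) {
    # Evaluate the condition against the updated data frame.
    keep <- eval(parse(text = cond), out, parent.frame())
    # Treat NA as "do not keep" and never collapse to a vector.
    out <- out[!is.na(keep) & keep, , drop = FALSE]
  }
  out
}

df <- data.frame(x = c(1, 2, 3), y = c(1, 2, 3))
res <- derive_and_filter(df, 'z = x*y', 'z > 2')
res
```

With `cond = NULL` the function only adds the new variable, which matches the `within()` call above.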
Mark Heckmann

transform() is the easy way if using this interactively:

data <- data.frame(x=rnorm(10), y=rnorm(10))

data <- transform(data, z = x * y)

R> head(data)
        x        y        z
1 -1.0206  0.29982 -0.30599
2 -1.6985  1.51784 -2.57805
3  0.8940  1.19893  1.07187
4 -0.3672 -0.04008  0.01472
5  0.5266 -0.29205 -0.15381
6  0.2545 -0.26889 -0.06842

You can't do this with transform() when the expression is stored in form, though. within(), which is similar to transform(), does allow it, e.g.

R> within(data, eval(parse(text = form)))
         x        y         z
1  -0.8833 -0.05256  0.046428
2   1.6673  1.61101  2.686115
3   1.1261  0.16025  0.180453
4   0.9726 -1.32975 -1.293266
5  -1.6220 -0.51079  0.828473
6  -1.1981  2.62663 -3.147073
7  -0.3596 -0.01506  0.005416
8  -0.9700  0.21865 -0.212079
9   1.0626  1.30377  1.385399
10 -0.8020 -1.04639  0.839212

though it involves some amount of jiggery-pokery with the language, which to my mind is not elegant. Effectively, you are doing something like this:

R> eval(eval(parse(text = form), data), data, parent.frame())
 [1]  0.046428  2.686115  0.180453 -1.293266  0.828473 -3.147073  0.005416
 [8] -0.212079  1.385399  0.839212

(and assigning the result to the named component in data.)
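
One way to make that assignment step explicit, without going through within(), is to split the parsed string into its target name and its right-hand side. The following is a sketch under assumptions: `add_from_string` is a hypothetical helper name, and it assumes form holds exactly one simple assignment of the shape `name = expression` (or `name <- expression`).

```r
# Hypothetical helper: split 'z = x*y' into the target name and the
# right-hand side, then evaluate only the right-hand side.
add_from_string <- function(data, form) {
  asgn <- parse(text = form)[[1]]          # the call `=`(z, x * y)
  stopifnot(is.call(asgn),
            as.character(asgn[[1]]) %in% c("=", "<-"),
            is.name(asgn[[2]]))
  target <- as.character(asgn[[2]])        # "z"
  # Columns of `data` are looked up first, then the caller's environment.
  data[[target]] <- eval(asgn[[3]], data, parent.frame())
  data
}

df2 <- add_from_string(data.frame(x = 1:3, y = 4:6), 'z = x*y')
df2
```

Because the string is checked to be a single assignment before anything is evaluated, malformed input fails early, although the right-hand side is still arbitrary R code.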

Does form have to come like this, as a character string representing some expression to be evaluated?

Gavin Simpson