I want to write a function that will generate a new variable based on relations specified by the user. For example, given the data frame:
d=structure(list(x1 = c(1.51402536388423, 2.46080908251235, 0.0820537335444602,
0.397916902799275, 1.95703984456426, 0.339037316676135, -0.0983477082382985,
-0.811438758653617, -0.22166264965645, -1.24251846727355), x2 = c(1.31813185688133,
1.72398579121766, -0.193614904270392, 0.432834246728345, 1.59997674335209,
0.600172345889666, -0.215380204258891, -0.561283409895365, 0.042565271836392,
-1.19165094830462), x3 = c(0.811032464442614, 0.775382517472752,
-0.513659338850135, 1.88476174946952, -0.609641201640788, -1.64673649834054,
-2.0395881504007, -0.0752358173117906, -1.23648041024926, 2.4485419578765
)), .Names = c("x1", "x2", "x3"), row.names = c(NA, -10L), class = "data.frame")
The user may specify something like y~.5*x1+.2*x2+.4*x3 to create a new variable y. This is trivially easy to do for one variable, but I don't know how to generalize this. Thus:
How do I write a function that identifies the variables selected and creates a new variable based on these weights?
I think the function would take two arguments, NewVariable=function(model,data), but I'm not sure what to do next.
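To make the goal concrete, here is a minimal sketch of the kind of function I have in mind. It assumes the right-hand side of the formula can simply be evaluated as ordinary R arithmetic with the columns of the data frame in scope. The argument checks and the small `toy` data frame are my own illustrative assumptions, and I am not sure this is the most robust approach.

```r
# Hypothetical sketch: evaluate the right-hand side of a two-sided formula
# inside `data` and store the result under the left-hand-side name.
NewVariable <- function(model, data) {
  stopifnot(inherits(model, "formula"), length(model) == 3L)

  new_name <- all.vars(model[[2L]])  # name on the left-hand side, e.g. "y"
  rhs_vars <- all.vars(model[[3L]])  # variables used on the right-hand side
  stopifnot(length(new_name) == 1L)

  # Assumption: every right-hand-side variable must be a column of `data`
  missing_vars <- setdiff(rhs_vars, names(data))
  if (length(missing_vars) > 0L) {
    stop("not found in data: ", paste(missing_vars, collapse = ", "))
  }

  # Evaluate the right-hand-side expression with the columns of `data` in scope
  data[[new_name]] <- eval(model[[3L]], envir = data,
                           enclos = environment(model))
  data
}

# Small made-up data frame, for illustration only
toy <- data.frame(x1 = c(1, 2), x2 = c(10, 20), x3 = c(100, 200))
out <- NewVariable(y ~ .5 * x1 + .2 * x2 + .4 * x3, toy)
```

With the data frame above, the call would be NewVariable(y~.5*x1+.2*x2+.4*x3, d). Because the right-hand side is evaluated as plain arithmetic rather than interpreted as a model formula, `*` and `+` mean multiplication and addition here, not interaction and term inclusion as they would in lm().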
Note that this question is similar to the question "extract variables in formula from a data frame", except that here the user would specify "regression weights".