
I am trying to create a simple function in R that can reference multiple datasets and multiple variable names. Using the following code, I get an error, which I believe is due to how the columns are referenced inside the function:

set.seed(123)
dat1 <- data.frame(x = sample(10), y = sample(10), z = sample(10))
dat2 <- data.frame(x = sample(10), y = sample(10), z = sample(10))

table(dat1$x, dat1$y)
table(dat2$x, dat2$y)

fun <- function(dat, sig, range){print(table(dat$sig, dat$range))}

fun(dat = dat1, sig = x, range =  y)
fun(dat = dat2, sig = x, range =  y)

Any idea how to adjust this code so that it can return the table appropriately?

coding_heart

1 Answer


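The `[[ ]]` operator on a data frame is similar to `$`, but it evaluates its argument, so you can pass in an object and have the column looked up by that object's value. `$` takes the name literally instead, so `dat$sig` looks for a column called `sig`. When you call the function, pass the column name as a string, e.g. `sig = "x"`. If you leave out the quotes, R will look for an object named `x`.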

fun <- function(dat, sig, range){print(table(dat[[sig]], dat[[range]]))}

fun(dat = dat1, sig = "x", range =  "y")
fun(dat = dat2, sig = "x", range =  "y")
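
To see why the original version fails, here is a minimal sketch comparing the two operators. The data frame `df`, the variable `col`, and the function `fun2` are made up for illustration and are not part of the question.

```r
# Hypothetical example data, not from the question
df <- data.frame(x = 1:3, y = c("a", "b", "c"))
col <- "x"

df$col      # NULL: `$` looks for a column literally named "col"
df[[col]]   # 1 2 3: `[[` evaluates `col` and looks up column "x"

# The same lookup inside a function, returning the table instead of printing it
fun2 <- function(dat, sig, range) {
  table(dat[[sig]], dat[[range]])
}

tab <- fun2(df, "x", "y")
dim(tab)    # 3 3: one row per x value, one column per y value
```

Returning the table rather than calling `print()` inside the function is a design choice: it lets the caller store or further process the result, and the table still prints when the function is called at the top level.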
r.bot
Andriy T.
  • The `[[ ]]` operator on a data frame is similar to `$` but lets you pass an object and look up its value. Outside the function you then assign the value "x" to sig; if you don't put quotes there, R will look for an object named x. – Andriy T. Jun 03 '15 at 17:50
  • Your comment should probably go into the answer itself. Comments are not guaranteed to stick around/not be deleted, so anything important to the conversation should be in the Q and A itself. – Frank Jun 03 '15 at 17:54
  • Good comment from @AndriyTkach, given R's not-so-great scoping decisions. Just to illustrate the point: `r = 1:10; f = function(x, r) {x[r]}`, then compare `f(10:1)` vs `f(10:1, 10:1)`. – Vlo Jun 03 '15 at 17:58
  • I fail to understand @Vlo's concern. In the first instance `r` is found outside the function, and in the second instance a different argument for `r` gives a different result. Where is the difficulty? I admit to puzzlement that first doing `rm(r)` and then repeating `f(10:1)` is not throwing an error. – IRTFM Jun 03 '15 at 18:19
  • @BondedDust I think the concern is that `r` is an argument to the function, so it should at least warn when it looks for `r` there and fails to find it. I can't figure why `f` works without a second argument here, either; I suspect it's some edge case. Seems the whole thing is very tangential to this Q&A, in any case. – Frank Jun 03 '15 at 19:08
  • I found a sentence in the help page `?"["` that I suspect covers this: "An empty index selects all values: this is most often used to replace all the entries but keep the attributes." I think the issue relating to `r` being in the Global environment is bogus. Compare: `r = 1:10; f = function(x, z=NULL) {x[z]} ; f(10:1) #integer(0)` versus `r = 1:10; f = function(x, z) {x[z]} ; f(10:1) # [1] 10 9 8 7 6 5 4 3 2 1`. It's not the scoping issue at all but rather how "[" is defined. – IRTFM Jun 03 '15 at 19:42
  • Thanks for adding the comment to the answer for me. I will keep it in mind in my future answers. – Andriy T. Jun 05 '15 at 06:40