I need to make a million tables, so rather than loop, I'm trying to functionalize my code. However, I can't get a function to work, and I can't figure out why. Here are some example data to work with:

  # example data:
set.seed(1)
dframe <- data.frame(time=c(rep("before", 6), rep("after", 4)), 
                     Quest1=sample(0:5, 10, replace=TRUE))
dframe
  # result: 
     time Quest1
1  before      1
2  before      2
3  before      3
4  before      5
5  before      1
6  before      5
7   after      5
8   after      3
9   after      3
10  after      0

Here is the code I've tried:

  # works:
tab1 <- prop.table(with(dframe, table(time, factor(Quest1, c(0:5)))), 1)
tab1
  # result:
time         0     1     2     3     4     5
after  0.250 0.000 0.000 0.500 0.000 0.250
before 0.000 0.333 0.167 0.167 0.000 0.333

  # doesn't work:  
makeTab = function(data, rowVar, colVar) {
  prop.table(with(data, table(rowVar, factor(colVar, c(0:5)))), 1)
}
tab1 <- makeTab(dframe, time, Quest1)
  # result:
Error in factor(colVar, c(0:5)) : object 'Quest1' not found

  # works:
tab1 <- prop.table(table(dframe$time, factor(dframe$Quest1, c(0:5))), 1)
tab1
  # (result same as above)

  # doesn't work:  
makeTab = function(data, rowVar, colVar) {
  prop.table(table(data$rowVar, factor(data$colVar, c(0:5))), 1)
}
tab1 <- makeTab(dframe, time, Quest1)
tab1
  # result:
     0 1 2 3 4 5

Note that looping does work:

  # works:  
tab <- list()
for(i in 1:1){
  tab[[i]] <- prop.table(table(dframe$time, factor(dframe[,i+1], c(0:5))), 1)
}
tab
  # result:
[[1]]

             0     1     2     3     4     5
  after  0.250 0.000 0.000 0.500 0.000 0.250
  before 0.000 0.333 0.167 0.167 0.000 0.333
Brian Tompsett - 汤莱恩
gung - Reinstate Monica

3 Answers

You need to use get. Your first example will work if you wrap rowVar and colVar in get and pass the column names as strings:

makeTab = function(data, rowVar, colVar) {
  prop.table(with(data, table(get(rowVar), factor(get(colVar), c(0:5)))), 1)
}
tab1=makeTab(dframe, 'time', 'Quest1')

tab1

#                 0         1         2         3         4         5
#  after  0.2500000 0.0000000 0.0000000 0.5000000 0.0000000 0.2500000
#  before 0.0000000 0.3333333 0.1666667 0.1666667 0.0000000 0.3333333

Or, in your second example, use [ rather than $. As written, data$rowVar looks for a column literally named rowVar rather than the column whose name is stored in rowVar:

makeTab = function(data, rowVar, colVar) {
  prop.table(table(data[, rowVar], factor(data[, colVar], c(0:5))), 1)
}

Also note I'm passing the strings of my column names ('Quest1') rather than an object named Quest1.


As Joran mentioned, the second option is probably preferable since using get can often have unforeseen consequences!
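To make the difference between `$` and `[`/`[[` concrete, here is a minimal sketch. The name `makeTabStr` is hypothetical (it is not from the question), the data are typed in by hand to mirror the printed example, and the function assumes that column names are passed as strings and that responses are coded 0 to 5.

```r
# Small data frame mirroring the question's layout (values entered by hand).
dframe <- data.frame(time   = c(rep("before", 6), rep("after", 4)),
                     Quest1 = c(1, 2, 3, 5, 1, 5, 5, 3, 3, 0))

colVar <- "Quest1"
viaDollar  <- dframe$colVar     # looks for a column literally named "colVar"
viaBracket <- dframe[[colVar]]  # looks up the column whose name is stored in colVar

is.null(viaDollar)                    # TRUE: there is no column called "colVar"
identical(viaBracket, dframe$Quest1)  # TRUE

# Hypothetical helper: takes the column names as strings.
makeTabStr <- function(data, rowVar, colVar, levels = 0:5) {
  prop.table(table(data[[rowVar]], factor(data[[colVar]], levels)), 1)
}

tabStr <- makeTabStr(dframe, "time", "Quest1")

# Apply the helper to every question column; the result is a named list of tables.
questCols <- setdiff(names(dframe), "time")
tabs <- lapply(setNames(questCols, questCols),
               function(q) makeTabStr(dframe, "time", q))
```

Because the helper takes strings, it can be driven directly by `names(dframe)`, which is what makes it convenient when there are many question columns.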

Justin
  • Thanks for your help. This works. I'll use the latter suggestion. What is the principle at work here, though? I was unfamiliar w/ `get`, the code works fine if it's not embedded in a function. W/ the function you need 2 things: the alternative code, & the quotation marks. Why does the function make it different from straight code or loop? – gung - Reinstate Monica Jul 11 '13 at 18:38
  • @gung The code is not the same. In your first function you've replaced the names of columns with variables that contain the names of the columns, so you need `get` to get the value of the variable out. In general, I find that `with` and the way it works can be confusing at best. I only use it when I want to write less; otherwise I opt for subsetting with `[`, especially programmatically. – Justin Jul 11 '13 at 18:41
  • the second version is probably safer and is more in line with what R users usually do. – dickoa Jul 11 '13 at 18:46
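As a footnote to the comments above, the sketch below illustrates why the bare-name call fails, and why the error names `Quest1` rather than `time`. `showArgs` is a hypothetical helper written only for this illustration. It assumes a default session in which the stats package is attached and the workspace contains no objects called `time` or `Quest1`.

```r
dframe <- data.frame(time = c("before", "after"), Quest1 = c(1, 5))

# Hypothetical helper that only reports what its arguments evaluate to.
showArgs <- function(data, rowVar, colVar) {
  # Arguments are evaluated where the function was CALLED, not inside `data`,
  # so with() never gets the chance to look the names up as columns.
  rowIsFunction <- with(data, is.function(rowVar))
  colResult <- tryCatch(with(data, colVar),
                        error = function(e) conditionMessage(e))
  list(rowIsFunction = rowIsFunction, colResult = colResult)
}

res <- showArgs(dframe, time, Quest1)
res$rowIsFunction  # TRUE: `time` resolved to the function stats::time()
res$colResult      # the "object 'Quest1' not found" error message
```

Outside a function, `with(dframe, table(time, ...))` works because `time` is written literally inside the expression and is looked up in `dframe` first. Inside the function, the expression only contains `rowVar`, and R evaluates the caller's `time` instead.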

Because you mentioned millions of tables and I fear from your loop construct you intend to use loops (e.g., lapply), here is an alternative to do two (or more) columns:

set.seed(1)
dframe <- data.frame(time=c(rep("before", 6), rep("after", 4)), 
                     Quest1=sample(0:5, 10, replace=TRUE),
                     Quest2=sample(0:5, 10, replace=TRUE))

library(reshape2)
dframe <- melt(dframe,id.vars="time")

# proportions within each time-by-question combination (margins 1 and 3)
tab <- prop.table(table(dframe$time, factor(dframe$value, c(0:5)), dframe$variable), 
                  c(1,3))
tab

# , ,  = Quest1
# 
# 
#                0         1         2         3         4         5
# after  0.2500000 0.0000000 0.0000000 0.5000000 0.0000000 0.2500000
# before 0.0000000 0.3333333 0.1666667 0.1666667 0.0000000 0.3333333
# 
# , ,  = Quest2
# 
# 
#                0         1         2         3         4         5
# after  0.0000000 0.0000000 0.2500000 0.0000000 0.5000000 0.2500000
# before 0.0000000 0.3333333 0.3333333 0.0000000 0.3333333 0.0000000
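The result here is a three-dimensional array rather than a list, so an individual table is pulled out by name along the third dimension. Below is a sketch of that step. It uses base R's `stack()` in place of `reshape2::melt()` only so that it runs without extra packages, and the data are entered by hand to be consistent with the tables printed above; the names `long`, `qCols`, and `tab3` are illustrative.

```r
dframe <- data.frame(time   = c(rep("before", 6), rep("after", 4)),
                     Quest1 = c(1, 2, 3, 5, 1, 5, 5, 3, 3, 0),
                     Quest2 = c(1, 1, 4, 2, 4, 2, 4, 5, 2, 4))

qCols <- c("Quest1", "Quest2")
long <- stack(dframe[qCols])   # columns: values, ind (the question name)
long$time <- rep(dframe$time, times = length(qCols))

# Proportions within each time-by-question combination (margins 1 and 3).
tab3 <- prop.table(table(long$time, factor(long$values, 0:5), long$ind),
                   c(1, 3))

tab3[, , "Quest2"]             # the time-by-response table for one question
```

Keeping everything in one array avoids building the tables one at a time, at the cost of reshaping the data to long format first.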
Roland
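If you want to stick to the design of your function (bare column names), you can use eval too, together with substitute: substitute captures the unevaluated argument, and eval then looks it up in data. Calling eval(rowVar) on its own is not enough in a clean session, because rowVar has already been evaluated in the calling environment by then.

makeTab = function(data, rowVar, colVar) {
    # capture the bare names before they are evaluated, then evaluate them in `data`
    rowVar <- substitute(rowVar)
    colVar <- substitute(colVar)
    prop.table(table(eval(rowVar, data), factor(eval(colVar, data), 0:5)), 1)
}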

makeTab(dframe, time, Quest1)
##              0       1       2       3       4       5
## after  0.25000 0.00000 0.00000 0.50000 0.00000 0.25000
## before 0.00000 0.33333 0.16667 0.16667 0.00000 0.33333
dickoa
  • for posterity [The dangers of eval(parse(...))](http://stackoverflow.com/questions/13649979/what-specifically-are-the-dangers-of-evalparse) – Justin Jul 11 '13 at 18:39
  • @Justin I use quite often though...but was useful to go through this discussion. thks – dickoa Jul 11 '13 at 18:44