I am trying to write a function that accepts Var1 and Var2 from the user, runs t.test, and returns the mean for the female group. However, the calc line throws an error. If I drop the paste and as.formula calls and run t.test(dat[[Var2]] ~ dat[[Var1]]) instead, I get the correct answer.

In my original code, however, I am required to use the paste function. Could anyone point out the mistake in the code below, which uses paste and as.formula? I am using the quine data frame from the MASS library.

func = function(dat=quine,Var1,Var2){
  # calc = t.test(dat[[Var2]] ~ dat[[Var1]])  # gives the answer
  calc = t.test(as.formula(paste(dat[[Var2]], dat[[Var1]], sep="~"))) #gives an error
  return(F.mean = calc$estimate[1])
}

func(Var1= "Sex", Var2= "Days")

Here is head(quine):

  Eth Sex Age Lrn Days
1   A   M  F0  SL    2
2   A   M  F0  SL   11
3   A   M  F0  SL   14
4   A   M  F0  AL    5
5   A   M  F0  AL    5
6   A   M  F0  AL   13

BeginnerR

2 Answers

This should work:

func <- function(dat = quine, Var1, Var2){
  calc = t.test(as.formula(paste("dat[[Var2]]", "dat[[Var1]]", sep = "~"))) 
  return(F.mean = calc$estimate[1])
}

func(Var1 = "Sex", Var2 = "Days")

Note the difference between pasting a string ("dat[[Var2]]", in quotes) and pasting an object (dat[[Var2]], which inserts the column's values).

User 6683331

Add this line inside the function

print(paste(dat[[Var2]], dat[[Var1]], sep="~"))

to see what is wrong. paste works elementwise: it pastes each element of dat[[Var2]] to the corresponding element of dat[[Var1]], so the result is a character vector of length nrow(dat) (for quine, strings such as "2~M"). Only the first element is then coerced to a formula, and that is the only one t.test uses.
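
A minimal sketch of that elementwise behaviour, using small made-up vectors in place of the quine columns (days and sex are hypothetical stand-ins, not the real data):

```r
# Hypothetical stand-ins for dat[["Days"]] and dat[["Sex"]]
days <- c(2, 11, 14)
sex  <- factor(c("M", "M", "F"))

# Pasting the objects gives one string per row, none of which is a usable formula
pasted_values <- paste(days, sex, sep = "~")
print(pasted_values)  # "2~M" "11~M" "14~F"

# Pasting the column names gives a single string describing the model
pasted_names <- paste("Days", "Sex", sep = "~")
print(pasted_names)   # "Days~Sex"
```

Only the second form yields one string that as.formula can turn into the intended Days ~ Sex formula.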

The correct code would be (note the data argument):

func = function(dat=quine,Var1,Var2){
  # calc = t.test(dat[[Var2]] ~ dat[[Var1]])  # gives the answer
  calc = t.test(as.formula(paste(Var2, Var1, sep="~")), data = dat)
  return(c(F.mean = unname(calc$estimate[1])))
}

Note also how the return instruction has changed.
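
To illustrate why the return line wraps the estimate in c() and unname(), here is a small sketch with a made-up named vector standing in for calc$estimate (the numbers are hypothetical, not results from quine):

```r
# Hypothetical stand-in for calc$estimate; t.test names its estimates like this
est <- c("mean in group F" = 15.2, "mean in group M" = 17.9)

# Subsetting keeps the original name
kept_name <- est[1]
print(names(kept_name))   # "mean in group F"

# Without unname(), c() prefixes the new name to the old one
combined <- c(F.mean = est[1])
print(names(combined))    # "F.mean.mean in group F"

# Dropping the old name first gives a clean label
renamed <- c(F.mean = unname(est[1]))
print(names(renamed))     # "F.mean"
```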

Though we don't have sample data to test the function, we can make up something.

set.seed(8294)
n <- 100
quine <- data.frame(Sex = sample(c("M", "F"), n, TRUE), Days = runif(n))

func(Var1= "Sex", Var2= "Days")
#   F.mean 
#0.5100037
Rui Barradas