5

I've got a csv with some results. I want to loop over the results, run a power.prop.test and output the result for each row in a new column. So far I've got this:

data <- read.csv("Downloads/data.csv", sep = ",", header = TRUE)

for (i in 1:nrow(data)) {
  n <- power.prop.test( p1 = data[i,5], p2 = data[i,6], sig.level=.1, power = .8, alternative = "one.sided")
  data <- cbind(data, n[1])
}
head(data)

Rather than populating one column with the output, my loop binds a new column for every power.prop.test I run. I want each result written into a single column instead, but I'm not sure how to achieve that.

If anyone has any advice on how to consolidate these outputs into one column that would be great.

Thanks!

Uriil
  • 11,948
  • 11
  • 47
  • 68
user2471446
  • 109
  • 1
  • 3
  • 8
  • Create a new column in advance of the desired data type (e.g., `data$output = 0`), and then write `data[i,'output'] = (relevant value)` in the loop. – Max Candocia Jul 07 '14 at 21:17
  • Just a heads-up, `read.csv("Downloads/data.csv")` would be fine. You included the default values for `sep` and `header`. – Rich Scriven Jul 07 '14 at 21:18
  • Power prop tests are described here: http://stat.ethz.ch/R-manual/R-patched/library/stats/html/power.prop.test.html – Andy Clifton Jul 07 '14 at 21:20
  • @user2471446: You're looking for a new column per loop, right? I admit, I'm not sure if `dplyr` is able to do that, sorry. I'd love to see a solution using this package, though. – maj Jul 07 '14 at 21:50
  • @user1362215: I'm very close. Your suggestion works, but I get the following error on the 8th record: "provided 8 variables to replace 1 variables". So it only populates the column for the first 8 rows. Thanks again for the help. – user2471446 Jul 07 '14 at 21:55
  • I think it has to do with the fact that the function returns a list. I'm also getting this error: `Error in uniroot(function(n) eval(p.body) - power, c(1, 1e+07)) : f() values at end points not of opposite sign`. Looking into it... – user2471446 Jul 07 '14 at 22:10
  • What kind of value are you trying to extract from the test? Is it a p-value? The test itself returns a `list` of 8 objects, which shouldn't be directly assigned to a column. You need to do list_object$name, list_object[['name']], or list_object[[index]] in order to get the desired value out of it. – Max Candocia Jul 07 '14 at 22:15
  • I'm trying to extract n, the number of required observations. I thought I was subsetting correctly, but apparently not. Will try your suggestions, thanks! – user2471446 Jul 07 '14 at 22:21
  • @user1362215: When I run the test on its own, I can index the first element using [1], but that won't work in my loop. Can you explain what to do with list_object$name, etc.? I can't find these in the help. – user2471446 Jul 07 '14 at 22:42

4 Answers

6

Try this:

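data <- read.csv("Downloads/data.csv")

# Pre-allocate the result column
data$newcolumn <- 0

for (i in seq_len(nrow(data))) {
  # power.prop.test() returns a list, so keep only n (the required sample size per group)
  res <- power.prop.test(p1 = data[i, 5], p2 = data[i, 6], sig.level = .1, power = .8, alternative = "one.sided")
  data$newcolumn[i] <- res$n
}
head(data)

I just added a new column, filled it with zeroes, and then wrote in the n value from each power.prop.test result, one row at a time, as it is calculated. Note that the test returns a list, so you have to pull out n with `$n` rather than assign the whole result to the cell.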
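On the list extraction discussed in the comments, here is a minimal sketch of how `[1]`, `[[1]]` and `$n` differ on a power.prop.test result. The proportions and the `df` data frame are made up for illustration and are not from the asker's data.

```r
# Made-up proportions, purely for illustration
res <- power.prop.test(p1 = 0.50, p2 = 0.75, sig.level = 0.1, power = 0.8,
                       alternative = "one.sided")

class(res[1])    # single brackets keep the list wrapper (a list of length 1)
class(res[[1]])  # double brackets return the element itself (a numeric value)
res$n            # the same element, extracted by name
res[["n"]]       # also the same element

# The bare number can be assigned into one cell of a pre-allocated column
df <- data.frame(p1 = 0.50, p2 = 0.75, n = 0)
df$n[1] <- res$n
```

Assigning `res` or `res[1]` to a single cell tries to put a list into a numeric slot, which is likely where the "provided 8 variables to replace 1 variables" message comes from.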

rucker
  • 393
  • 3
  • 13
1

Thanks for all the help! Here's the solution I'm using for now:

# Read in csv of results
data <- read.csv("Downloads/data.csv")
data$obs <- 0
data$obs2 <- 0
data$sig <- 0

# Loop over the tests and calculate the required observations for each one, based on CR, significance and statistical power
for (i in seq_len(nrow(data))) {
  # Error handling: where the CRs are the same, n cannot be calculated and the call fails, so we skip these tests
  n <- tryCatch(
    power.prop.test(p1 = data[i, 5], p2 = data[i, 6], sig.level = .2, power = .8, alternative = "one.sided"),
    error = function(e) e
  )
  if (!inherits(n, "error")) {
    # Real work: store n (per group) and twice n, then set sig = 1 if twice n is below the actual observations, otherwise 0
    data$obs[i] <- n[[1]]
    data$obs2[i] <- data$obs[i] * 2
    data$sig[i] <- if (data$obs2[i] < data$Traffic[i]) 1 else 0
  }
}
# End for loop

write.csv(data, file = "dataClean.csv")

This runs the power.prop.test on each row, includes error handling and a condition to tell you whether you have enough observations. I'm sure there are more efficient ways to write this, so I'll review your comments to see if I can incorporate them.
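One possible way to tidy this up, as a rough sketch only: wrap the call and its error handling in a small helper, then fill the column in the loop. The helper name `required_n`, the toy data frame `results` and its column names `cr_a` and `cr_b` are hypothetical. Unlike the code above, rows where n cannot be computed are left as NA rather than 0.

```r
# Hypothetical helper: returns the per-group sample size, or NA when
# power.prop.test() cannot solve for n (for example when p1 equals p2).
required_n <- function(p1, p2, sig.level = 0.2, power = 0.8) {
  tryCatch(
    power.prop.test(p1 = p1, p2 = p2, sig.level = sig.level,
                    power = power, alternative = "one.sided")$n,
    error = function(e) NA_real_
  )
}

# Toy data; the conversion-rate column names are made up for illustration.
results <- data.frame(
  cr_a    = c(0.10, 0.20, 0.05),
  cr_b    = c(0.12, 0.20, 0.08),
  Traffic = c(10000, 5000, 400)
)

results$obs <- NA_real_
for (i in seq_len(nrow(results))) {
  results$obs[i] <- required_n(results$cr_a[i], results$cr_b[i])
}

# Total observations needed across both groups
results$obs2 <- results$obs * 2

# 1 when the required total is below the observed traffic, 0 otherwise;
# rows where n could not be computed stay NA.
results$sig <- as.integer(results$obs2 < results$Traffic)
```

Returning NA keeps "could not compute" distinct from "not enough observations", which the 0 default in the original loop does not.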

user2471446
  • 109
  • 1
  • 3
  • 8
0

Looks like a job for apply. To avoid any possible interference with factor or character variables, I'm just submitting the two columns as the argument.

data$power.col <- apply(data[5:6], 1,
                        function(x) power.prop.test(p1 = x[1], p2 = x[2],
                                                    sig.level = .1, power = .8,
                                                    alternative = "one.sided")$n)
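A related sketch, assuming the two proportions sit in named columns: `mapply` walks the two columns in parallel, so no row ever has to be coerced to a common type. The data frame `results` and the column names `p_a` and `p_b` are hypothetical.

```r
# Toy data; column names are hypothetical.
results <- data.frame(
  label = c("test 1", "test 2"),
  p_a   = c(0.10, 0.30),
  p_b   = c(0.15, 0.40)
)

# mapply() passes the i-th element of each column to the function and
# simplifies the results to a numeric vector, one n per row.
results$n_per_group <- mapply(
  function(p1, p2) {
    power.prop.test(p1 = p1, p2 = p2, sig.level = 0.1, power = 0.8,
                    alternative = "one.sided")$n
  },
  results$p_a, results$p_b
)
```

As with the loop, a row where the two proportions are equal would raise an error here unless the call is wrapped in `tryCatch`.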
IRTFM
  • 258,963
  • 21
  • 364
  • 487
-1

First of all, I should say that I have never heard of power-prop tests. But then again, your question really isn't about the statistics, right?

Secondly, one package that makes it convenient to add columns to a data.frame is dplyr.

To quote the documentation: mutate(mtcars, displ_l = disp / 61.0237) - "Add a column displ_l (to mtcars) containing the values from the disp column divided by 61.0237."

maj
  • 2,479
  • 1
  • 19
  • 25