I am trying to run a survival analysis on a set of data I have collected. In this data frame (`m3`), each row is a patient and each column is a mutation I have identified. I have built a binary table indicating whether each patient is positive or negative for each mutation. I can run `survdiff` for a single column (mutation), but I have hundreds of them and want to loop through them all. I have written the following code, but I don't think it is correct, because nothing is output:

    for (i in m3[, 2:256]) {
      survdiff(Surv(m3$Overall.Survival, m3$Status) ~ i, data = m3)
    }

Once I have these results, I want to build a table with one row per mutation (column of `m3`) and the p-value from the corresponding `survdiff` object as a column.

I'm not sure why the for loop produces no output, and I'm even less sure how to generate the new data frame. I believe I would need to subset the result objects.

Matt
  • Your loop is almost there; you just need to calculate the p-value. For example, using the colon data set from the survival package: `m <- colon[, c(10, 15, 3:9)]; lapply(m[, 3:ncol(m)], function(i) {s <- survdiff(Surv(m$time, m$status) ~ i); pchisq(s$chisq, length(unique(i)) - 1L, lower.tail = FALSE)})` – rawr Feb 10 '18 at 23:16
  • Can you elaborate why I should use the colon package? – Matt Feb 10 '18 at 23:33
  • There is no *colon* package. @rawr was using the *colon* dataset from *survival* package as example since you do not provide data for us to reproduce your issue. StackOverflow highly recommends an [MCVE](https://stackoverflow.com/help/mcve). – Parfait Feb 11 '18 at 02:03
  • Thank you for the clarification! What additional data would you like? I wasn't sure what else was needed. The data frame is m3, which includes overall survival, status, and a binary indicator of whether or not each patient has each mutation (columns 2:256). – Matt Feb 11 '18 at 02:14
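
The approach in rawr's comment can be sketched as below, extended to build the one-row-per-mutation table the question asks for. The original loop likely printed nothing because R does not auto-print values inside a `for` loop; the result has to be printed explicitly or stored. This is only a sketch under assumptions: it uses the `colon` data set from the survival package as a stand-in for `m3`, the four binary columns chosen here stand in for mutation columns, and the names `p_values` and `results` are illustrative. The degrees of freedom are taken as the number of groups minus one, which assumes no strata in the formula.

```r
library(survival)

# Stand-in for m3: survival time, event status, then binary covariates
# (assumption: these play the role of the mutation columns).
m <- colon[, c("time", "status", "sex", "obstruct", "perfor", "adhere")]

# Run a log-rank test for each covariate column and keep only the p-value.
p_values <- sapply(m[, 3:ncol(m)], function(x) {
  s <- survdiff(Surv(m$time, m$status) ~ x)
  # df = number of groups actually compared, minus one
  pchisq(s$chisq, df = length(s$n) - 1L, lower.tail = FALSE)
})

# One row per covariate (mutation), with the p-value as a column.
results <- data.frame(mutation = names(p_values),
                      p_value  = unname(p_values))
print(results)
```

For the real data, the same pattern would presumably apply with `m3[, 2:256]`, `m3$Overall.Survival`, and `m3$Status`. A column in which every patient has the same value would make `survdiff` fail, so such columns may need to be dropped first, and with hundreds of tests a multiple-testing adjustment such as `p.adjust` is probably worth considering.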

0 Answers