Here's a brief description of the data I have: survival data from 4 separate studies that compare survival rates among 20 groups. Each study lasted a different amount of time; for example, Study 1 lasted 42 days and Study 2 lasted 50 days.
Here's a snapshot of the data:
UniqueID  Time  Censored  Group1  Group2  Study
ABC123       6         1       1     111      1
DEF456      42         0       1     112      1
GHI789      42         0       2     344      1
JKL012      38         1       2     564      1
MNO345      19         1      10     761      1
PQR678      13         1       5     222      2
STU901       5         1      20     333      2
VWX234      50         0      15     444      2
YZA567      20         1      15     555      2
BCD890      50         0      12     555      2
Here's what I want to do: I want to create a function that allows the user to select two parameters (Study, Group1) to compare survival rates.
This is what I have attempted so far:
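library(openxlsx)  # assumed source of read.xlsx(file, sheet=)
SurvA=function(a,b){
  setwd("path to my file")
  data=read.xlsx("mydata.xlsx",sheet=1)
  # keep only the rows for study a
  data_study=data[data$Study==a,]
  # distinct Group1 values present in that study
  unique(data_study$Group1)
}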
I want to write a loop that scans the list of unique Group1 numbers and creates a Group1-specific indicator variable for each one, following this logic:
data_study$Group1_10=ifelse(data_study$Group1==10,1,0)
data_study$Group1_12=ifelse(data_study$Group1==12,1,0)
I'm unsure of how to proceed with the loop that would make this happen.
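A rough sketch of the kind of loop I have in mind, using only the Study 1 rows from the snapshot above. I am assuming that assigning through `[[` with a column name built by `paste0` is an acceptable way to create the variables, since `$` cannot take a computed name:

```r
# Toy data: the Study 1 rows from the snapshot (only the columns needed here)
data_study <- data.frame(
  Time     = c(6, 42, 42, 38, 19),
  Censored = c(1, 0, 0, 1, 1),
  Group1   = c(1, 1, 2, 2, 10)
)

# Create one 0/1 indicator column per distinct Group1 value,
# named Group1_<value> to match the hand-written examples
for (g in sort(unique(data_study$Group1))) {
  data_study[[paste0("Group1_", g)]] <- ifelse(data_study$Group1 == g, 1, 0)
}

names(data_study)
```

With this toy data the loop should add the columns Group1_1, Group1_2 and Group1_10.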
Once that is finalized, the rest of the code would look like this:
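library(survival)
library(survminer)
library(openxlsx)  # assumed source of read.xlsx(file, sheet=)
SurvA=function(a,b){
  setwd("path to my file")
  data=read.xlsx("mydata.xlsx",sheet=1)
  # keep only the rows for study a
  data_study=data[data$Study==a,]
  groups=unique(data_study$Group1)
  #LOOP (should create one Group1_<value> column per entry in groups)
  # build the formula from b, e.g. Surv(Time,Censored)~Group1_10
  fit_formula=as.formula(paste0("Surv(Time,Censored)~Group1_",b))
  # surv_fit (survminer) is used so ggsurvplot can recover the formula
  fit=surv_fit(fit_formula,data=data_study)
  ggsurv=ggsurvplot(fit,data=data_study,pval=TRUE,xlim=c(0,60),
    title=paste0("Study ",a," Survival Plot for Group ",b),xlab="Time (days)")
  ggsurv$plot=ggsurv$plot+theme(plot.title=element_text(hjust=0.5))
  print(ggsurv)
}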
Any help would be appreciated! Also, if you have suggestions for more efficient ways to write what I already have, I would be very happy to learn them.
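For instance, I wonder whether the per-group columns could be skipped entirely by flagging only the requested group. A minimal sketch of that idea follows; `surv_compare` and the `InGroup` column are hypothetical names of my own, and I am assuming the data frame is passed in rather than read from disk inside the function:

```r
library(survival)

# Hypothetical helper: compares one Group1 value against all other groups
# within a single study. `data` must have Time, Censored, Group1 and Study.
surv_compare <- function(data, study, group) {
  # Keep only the rows belonging to the requested study
  data_study <- data[data$Study == study, ]
  # Single 0/1 flag for the requested group; no Group1_<value> columns needed
  data_study$InGroup <- as.integer(data_study$Group1 == group)
  # Censored is passed as the event indicator, as in the code above
  fit <- survfit(Surv(Time, Censored) ~ InGroup, data = data_study)
  # Return the filtered data too, since plotting functions may need it
  list(fit = fit, data = data_study)
}
```

Something like `res <- surv_compare(data, 1, 2)` would then give a fit and the filtered data that could be handed to `ggsurvplot` in the same way as above, though I have not checked how this behaves on the full data set.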