This may seem like a silly question, but I was wondering why the median from median() and the median from survfit() (survival package) are different.
I tried to reproduce the tutorial on sciencing.com:
List the survival time of all the subjects in your sample. For example, if you have five students (in a real study, you'd have more) and their times to graduation were 3 years, 4 years (so far), 4.5 years, 3.5 years and 7 years (so far), write down the times: 3, 4, 4.5, 3.5, 7.
Put a plus sign (or other mark) next to any times that are right-censored (that is, those that have not had the event happen yet). Your list would look like this: 3, 4+, 4.5, 3.5, 7+.
So I created a data.frame (T for dead and F for alive):
survive <- data.frame(OS = c(3,4,4.5,3.5,7), status = c(T,F,T,T,F))
The median is 4, as sciencing.com says:
median(survive$OS)
[1] 4
But when I run a survival analysis with the survival package, I get this:
Call: survfit(formula = Surv(OS, status) ~ 1, data = survive)
      n  events  median 0.95LCL 0.95UCL
    5.0     3.0     4.5     3.5      NA
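For reference, here is a minimal base-R sketch that reproduces the 4.5 by hand. It assumes survfit() is using its default Kaplan-Meier estimator and that there are no tied times (true for these five values). The names at_risk, km and km_median are my own, not from the survival package. The sketch suggests the gap comes from how the censored times (4+ and 7+) are handled: median() treats them as ordinary observations, while the Kaplan-Meier curve only drops at event times.

```r
# Same data as above, written with TRUE/FALSE instead of T/F
survive <- data.frame(OS = c(3, 4, 4.5, 3.5, 7),
                      status = c(TRUE, FALSE, TRUE, TRUE, FALSE))

# Plain median: ignores the censoring indicator entirely
plain_median <- median(survive$OS)

# Kaplan-Meier by hand (sketch, assumes no tied times)
ord     <- order(survive$OS)
time    <- survive$OS[ord]                      # 3, 3.5, 4, 4.5, 7
event   <- survive$status[ord]                  # TRUE, TRUE, FALSE, TRUE, FALSE
at_risk <- length(time) - seq_along(time) + 1   # subjects still under observation
km      <- cumprod(1 - event / at_risk)         # estimated survival probability S(t)

print(data.frame(time, event, at_risk, km))

# Median survival: first time at which S(t) falls to 0.5 or below
km_median <- min(time[km <= 0.5])
print(c(plain_median = plain_median, km_median = km_median))
```

The estimated survival is 0.8 after t = 3, 0.6 after t = 3.5, stays at 0.6 through the censored time 4, and only falls below 0.5 (to 0.3) at t = 4.5, which matches the median column in the survfit output.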
So my question is: why are these two medians different?
thanks