[Figure: Kaplan-Meier survival curves for each node]

Hey guys, so I taught myself time-to-event analysis recently and I need some help understanding it. I made some Kaplan-Meier survival curves.

Sure, the number of observations within each node is small but let's pretend that I have plenty.

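library(dplyr)  # provides %>% and filter()
# Keep only patients with serum_creatinine <= 1.8 and ejection_fraction <= 25
K <- HF %>% 
  filter(serum_creatinine <= 1.8, ejection_fraction <= 25)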


## Call: survfit(formula = Surv(time, DEATH_EVENT) ~ 1, data = K)
## 
##  time n.risk n.event survival std.err lower 95% CI upper 95% CI
##    20     36       5    0.881  0.0500        0.788        0.985
##    45     33       3    0.808  0.0612        0.696        0.937
##    60     31       3    0.734  0.0688        0.611        0.882
##    80     23       6    0.587  0.0768        0.454        0.759
##   100     17       1    0.562  0.0776        0.429        0.736
##   110     17       0    0.562  0.0776        0.429        0.736
##   120     16       1    0.529  0.0798        0.393        0.711
##   130     14       0    0.529  0.0798        0.393        0.711
##   140     14       0    0.529  0.0798        0.393        0.711
##   150     13       1    0.488  0.0834        0.349        0.682

If someone were to ask me about the third node, would the following statement be valid?

For any new patient who walks into this hospital with Serum Creatinine <= 1.8 and Ejection Fraction <= 25, the probability of surviving beyond 140 days is 53%.

What about:

The survival distributions for the samples analyzed, and no other future incoming samples, are visualized above.

I want to make sure these statements are correct. I would also like to know whether logistic regression could be used to predict the binary variable DEATH_EVENT. Since the TIME variable determines how much weight one patient's death at 20 days carries relative to another patient's death at 175 days, I understand that it needs to be accounted for.

If logistic regression can be used, does that imply anything about keeping or removing the TIME variable?

Antonio

1 Answer

Here are some thoughts:

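Logistic regression is not appropriate in your case because it is not the correct method for time-to-event analysis: it ignores when each death occurred and treats censored patients as if their final outcome were known.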

  • If the clinical outcome observed is “either-or,” such as if a patient suffers an MI or not, logistic regression can be used.

  • However, if the information on the time to MI is the observed outcome, data are analyzed using statistical methods for survival analysis.

Text from here
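
To make the distinction concrete, here is a minimal sketch in base R. The vectors `time` and `event` are made-up toy data, not the HF dataset, and `km_estimate` is a hypothetical helper written only for illustration. It contrasts a naive "either-or" summary, which counts every censored patient as a survivor, with a hand-computed Kaplan-Meier (product-limit) estimate that only uses patients while they are still under observation.

```r
# Hypothetical toy data (NOT the HF dataset): follow-up time in days and
# event indicator (1 = death observed, 0 = censored, i.e. alive when last seen).
time  <- c(20, 45, 60, 60, 80, 100, 120, 150, 150, 150)
event <- c( 1,  0,  1,  0,  1,   0,   1,   0,   0,   0)

# Naive "either-or" summary: every censored patient is counted as a survivor.
naive_survival <- 1 - mean(event)

# Kaplan-Meier estimate computed by hand: at each distinct death time,
# multiply the running survival by (1 - deaths / number still at risk).
km_estimate <- function(time, event) {
  death_times <- sort(unique(time[event == 1]))
  surv <- 1
  out <- numeric(length(death_times))
  for (i in seq_along(death_times)) {
    tt <- death_times[i]
    at_risk <- sum(time >= tt)               # still under observation at tt
    deaths  <- sum(time == tt & event == 1)  # deaths exactly at tt
    surv <- surv * (1 - deaths / at_risk)
    out[i] <- surv
  }
  data.frame(time = death_times, survival = out)
}

km <- km_estimate(time, event)
print(naive_survival)  # 0.6
print(km)              # survival after the last death time is about 0.49
```

On this toy data the naive proportion (0.6) is noticeably higher than the Kaplan-Meier estimate after the last death (about 0.49), because patients censored early are not evidence of long-term survival.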

If you want to use a regression model in survival analysis, then you should use a Cox proportional hazards model. To understand the difference between a Kaplan-Meier analysis and a Cox proportional hazards model, you should understand both of them.

The next step would be to understand how a univariable Cox proportional hazards model differs from a multivariable one.

Once you understand all 3 methods (Kaplan-Meier, univariable Cox and multivariable Cox), you can answer your own question about whether this is a valid statement:
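
As a rough illustration of the univariable versus multivariable distinction, here is a sketch using `coxph` from the `survival` package. The data frame `sim` is simulated; its column names mirror the question, but the values and the data-generating model are assumptions made up for this example, so the numbers say nothing about the real HF data.

```r
library(survival)  # recommended package, shipped with standard R installations

set.seed(1)
n <- 200
# Simulated stand-in for the HF data (values are made up).
sim <- data.frame(
  serum_creatinine  = runif(n, 0.5, 3),
  ejection_fraction = runif(n, 15, 60)
)
# Assumed model: hazard rises with creatinine and falls with ejection fraction.
rate <- 0.005 * exp(0.6 * sim$serum_creatinine - 0.03 * sim$ejection_fraction)
death_time  <- rexp(n, rate = rate)
censor_time <- runif(n, 30, 285)
sim$time        <- pmin(death_time, censor_time)
sim$DEATH_EVENT <- as.integer(death_time <= censor_time)

# Univariable Cox model: one predictor at a time.
fit_uni <- coxph(Surv(time, DEATH_EVENT) ~ serum_creatinine, data = sim)

# Multivariable Cox model: each hazard ratio is adjusted for the other predictor.
fit_multi <- coxph(Surv(time, DEATH_EVENT) ~ serum_creatinine + ejection_fraction,
                   data = sim)
summary(fit_multi)

# Check the proportional hazards assumption before trusting the model.
cox.zph(fit_multi)

# Model-based survival at 140 days for one hypothetical patient profile.
new_patient <- data.frame(serum_creatinine = 1.8, ejection_fraction = 25)
pred <- summary(survfit(fit_multi, newdata = new_patient), times = 140)
pred$surv
```

Unlike a Kaplan-Meier curve for a subgroup, the last step gives an estimate for a specific covariate profile, which is only as good as the model's assumptions.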

  • For any new patient who walks into this hospital with Serum Creatinine <= 1.8 and Ejection Fraction <= 25, the probability of surviving beyond 140 days is 53%.

There is nothing wrong with reporting the Kaplan-Meier results for a subgroup. But the statement carries a different weight if it comes from a multivariable Cox regression analysis, which adjusts for the other variables.
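
When reporting a Kaplan-Meier result for a subgroup, it usually helps to quote the confidence interval along with the point estimate. Here is a small sketch with `survfit` from the `survival` package on made-up toy vectors (`time` and `event` are hypothetical, not the HF data), reading off the estimate at 140 days:

```r
library(survival)

# Hypothetical toy subgroup (NOT the HF data): days of follow-up and
# event indicator (1 = death observed, 0 = censored).
time  <- c(20, 45, 60, 60, 80, 100, 120, 150, 150, 150)
event <- c( 1,  0,  1,  0,  1,   0,   1,   0,   0,   0)

# Kaplan-Meier fit for the whole subgroup (no covariates).
fit_km <- survfit(Surv(time, event) ~ 1)

# Estimated survival at 140 days with its 95% confidence interval.
s140 <- summary(fit_km, times = 140)
data.frame(time = s140$time, survival = s140$surv,
           lower = s140$lower, upper = s140$upper)
```

With only a handful of patients the interval is wide, which is the main reason to be careful about applying a subgroup estimate to every new patient.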

TarJae
  • Hi, I really appreciate your response! `TIME` isn't the dependent variable when using logistic regression; it would be an independent one. Would I be able to use logistic regression in this case? – Antonio Dec 07 '22 at 22:43
  • If the outcome is independent of time and your outcome variable is yes or no (e.g. 0 and 1) then logistic regression is appropriate. As always check the assumptions for logistic regression. – TarJae Dec 08 '22 at 06:26
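
For completeness, here is a sketch of the situation where logistic regression does fit: a yes/no outcome at a fixed horizon with complete follow-up, so there is no censoring. The data are simulated, and the variable `died_30d` and the 30-day horizon are assumptions invented for this example.

```r
set.seed(2)
n <- 300
# Simulated example (made up): every patient is followed for the full 30 days,
# so "died within 30 days" is known for everyone.
sim <- data.frame(ejection_fraction = runif(n, 15, 60))
p_death <- plogis(1.5 - 0.08 * sim$ejection_fraction)
sim$died_30d <- rbinom(n, size = 1, prob = p_death)

# Logistic regression on the binary outcome.
fit_logit <- glm(died_30d ~ ejection_fraction, data = sim, family = binomial)
summary(fit_logit)

# Predicted probability of death within 30 days for a hypothetical patient.
p25 <- predict(fit_logit, newdata = data.frame(ejection_fraction = 25),
               type = "response")
p25
```

If follow-up times differ between patients, as they do with `TIME` in the question, this setup no longer applies and a survival model is the safer choice.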