
I have a lifelines model which I fit using the following:

from lifelines import WeibullAFTFitter

model = WeibullAFTFitter()
model.fit(train, duration_col='duration', event_col='y', show_progress=True)

However, the durations it predicts for my test set (via `predicted_time = model.predict_expectation(test)`) are extremely large. In the uncensored case, the average error between the test durations and the predicted durations is 2289.3773 +/- 7584.9916.

The only issue is that the maximum possible duration is 1500 (assume the machines are replaced every five years). So my questions are:

  1. Is there a way to set an upper limit on time?
  2. If I normalise the duration to have 0 mean and standard deviation of 1, would the duration estimates improve?
  • Have you tried using median predictions instead (`predict_median`)? These are typically more stable (as the expectation can get very large as you've seen). – Cam.Davidson.Pilon May 20 '19 at 13:51
  • Yep. They are both quite large. – sachinruk May 20 '19 at 21:13
  • Maybe Weibull isn't the best fit. Do you have a c-index (`score_`) close to 0.5? Is there a particular covariate that has a very large coefficient? – Cam.Davidson.Pilon May 21 '19 at 01:17
  • The model.score_ is 0.85 and the largest coefficient is 3.71 followed by 2.57. I couldn't manage to get the other models to converge unfortunately, had issues with matrix inversion in other models. – sachinruk May 21 '19 at 06:44

0 Answers