
I have a lifelines model which I fit using the following:

from lifelines import WeibullAFTFitter

model = WeibullAFTFitter()
model.fit(train, duration_col='duration', event_col='y', show_progress=True)

However, the durations it predicts for my test set (via `predicted_time = model.predict_expectation(test)`) are extremely large. In the uncensored case, the average error between the test durations and the predicted durations is 2289.3773 +/- 7584.9916.

The only issue is that the maximum possible duration is 1500 (assume the machines are replaced every five years). So my questions are:

  1. Is there a way to set an upper limit on time?
  2. If I normalise the duration to have 0 mean and standard deviation of 1, would the duration estimates improve?
  • Have you tried using median predictions instead (`predict_median`)? These are typically more stable (as the expectation can get very large as you've seen). – Cam.Davidson.Pilon May 20 '19 at 13:51
  • Yep. They are both quite large. – sachinruk May 20 '19 at 21:13
  • Maybe Weibull isn't the best fit. Do you have a c-index (`score_`) close to 0.5? Is there a particular covariate that has a very large coefficient? – Cam.Davidson.Pilon May 21 '19 at 01:17
  • The model.score_ is 0.85 and the largest coefficient is 3.71 followed by 2.57. I couldn't manage to get the other models to converge unfortunately, had issues with matrix inversion in other models. – sachinruk May 21 '19 at 06:44

0 Answers