I'm trying to fit a Cox proportional hazards model to analyze the effect of the number of protest events on the survival of political regimes in different countries.
My dataset looks similar to this:
Country    year  sdate       edate       time  evercollapsed  protest  GDPgrowth
Country A  2003  1996-11-24  2012-12-31  5881  0              78       14.78
Country A  2004  NA          NA          NA    0              99        8.56
Country A  2005  NA          NA          NA    0              25        3.56
Country B  2003  2000-10-26  2011-05-21  3859  1              13        2.33
Country B  2004  NA          NA          NA    1              28        5.43
Country B  2005  NA          NA          NA    1               7        1.89
So my dataset has one row per country-year with yearly values for a number of variables, but the regime's start date (sdate), end date (edate) and survival time in days (time) are only filled in on the first row of each political regime.
My data includes information for 48 different political regimes and 15 of them collapse in the time span I am looking at.
I fitted a Cox PH model with the survival package:
myCPH <- coxph(Surv(time, evercollapsed) ~ protest + GDPgrowth, data = mydata)
This gives me the following result:
Call:
coxph(formula = Surv(time, evercollapsed) ~ protest + GDPgrowth,
    data = mydata)

              coef exp(coef) se(coef)     z     p
protest    0.01630   1.01644  0.00722  2.26 0.024
GDPgrowth -0.03447   0.96612  0.01523 -2.26 0.024

Likelihood ratio test=9.26  on 2 df, p=0.00977
n= 48, number of events= 15
   (556 observations deleted due to missingness)
So I'm losing 556 country-years, because those rows have NA for the regime's survival time and are dropped before the model is fitted. Only the first row of each of the 48 regimes is actually used.
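To double-check where the deleted observations come from, here is a minimal sketch on a toy data frame that copies the six example rows above (the values are illustrative only, and `model_vars`, `n_dropped` and `n_used` are names I made up). As far as I understand, coxph() drops incomplete rows because R's default `na.action` option is `na.omit`.

```r
# Toy data mirroring the six example rows shown above (illustrative only)
mydata <- data.frame(
  time          = c(5881, NA, NA, 3859, NA, NA),
  evercollapsed = c(0, 0, 0, 1, 1, 1),
  protest       = c(78, 99, 25, 13, 28, 7),
  GDPgrowth     = c(14.78, 8.56, 3.56, 2.33, 5.43, 1.89)
)

# With the default options("na.action") of na.omit, any row with an NA in a
# model variable is removed before fitting
model_vars <- c("time", "evercollapsed", "protest", "GDPgrowth")
n_dropped  <- sum(!complete.cases(mydata[, model_vars]))  # rows that would be deleted
n_used     <- sum(complete.cases(mydata[, model_vars]))   # rows that would be kept
```

In the toy data only the first row of each regime is complete, which matches the pattern in my real output (48 rows used, 556 deleted).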
My question is: how can I include the country-years that have no information on sdate, edate and time in the analysis?
I assume that if I simply copied the regime information into every country-year, this would inflate my number of regime collapses, since each collapse would then be counted once per row?
I also assume I have to give a unique ID to every political regime so that R can tell the different cases apart. How do I then have to fit the Cox PH model so that it uses the information from the different country-years?
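For reference, this is a minimal sketch of what I think the setup might look like, based on the counting-process form Surv(start, stop, event) that the survival package supports for time-varying covariates. It is not a confirmed solution. The names `regime_id`, `tstart`, `tstop`, `event`, `to_counting_process` and `fit_regime_cox` are hypothetical, and the sketch assumes that rows are sorted by regime and year, that each regime's first row carries sdate and edate, that every row is a year in which the regime existed, and that edate for surviving regimes is the end of the observation window. The event is flagged only in the interval in which a collapsed regime ends, so each collapse is counted once.

```r
library(survival)

# Toy data mirroring the six example rows above (illustrative only)
mydata <- data.frame(
  Country       = rep(c("Country A", "Country B"), each = 3),
  year          = rep(2003:2005, times = 2),
  sdate         = as.Date(c("1996-11-24", NA, NA, "2000-10-26", NA, NA)),
  edate         = as.Date(c("2012-12-31", NA, NA, "2011-05-21", NA, NA)),
  evercollapsed = c(0, 0, 0, 1, 1, 1),
  protest       = c(78, 99, 25, 13, 28, 7),
  GDPgrowth     = c(14.78, 8.56, 3.56, 2.33, 5.43, 1.89)
)

# Hypothetical helper: turn country-years into (tstart, tstop, event) intervals
to_counting_process <- function(d) {
  # Assumption: a new regime begins at every row with a non-missing sdate
  d$regime_id <- cumsum(!is.na(d$sdate))

  # Copy each regime's dates from its first row to all of its rows
  first_row <- match(d$regime_id, d$regime_id)
  s <- as.numeric(d$sdate[first_row])   # regime start, days since 1970-01-01
  e <- as.numeric(d$edate[first_row])   # regime end (or end of observation)

  # Calendar bounds of each country-year
  ys <- as.numeric(as.Date(paste0(d$year, "-01-01")))
  ye <- as.numeric(as.Date(paste0(d$year + 1, "-01-01")))

  # Regime age in days at the start and end of each interval
  d$tstart <- pmax(ys, s) - s
  d$tstop  <- pmin(ye, e) - s

  # Event only in the interval that contains the end of a collapsed regime
  d$event <- as.integer(d$evercollapsed == 1 & e <= ye)
  d
}

cp <- to_counting_process(mydata)

# Hypothetical wrapper for the model on the full data; it is not called here
# because the toy rows above contain no collapse within 2003-2005
fit_regime_cox <- function(d) {
  coxph(Surv(tstart, tstop, event) ~ protest + GDPgrowth, data = d)
}
```

On my real data I would then call fit_regime_cox(to_counting_process(mydata)). Because the regimes in the example started before the first observed year, tstart is greater than zero on the first row, which I understand the counting-process form treats as delayed entry.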
Many thanks in advance!