I am attempting to reproduce some survival analysis results published in a journal. The original results were produced in Stata. Here is the code:
stset time, id(leadid) failure(c_coup)
streg legislature leg_growth_2 gdp_1k chgdpen_fearonlaitin Oil_fearonlaitin postcoldwarlag civiliandictatorshiplag militarydictatorshiplag communist lpopl1_fearonlaitin ethfrac_fearonlaitin relfrac_fearonlaitin age, distribution(weibull) time
Here is my attempt to translate this to R
code:
# load packages
library(survival)
library(survminer)
# load data
data <- read.csv("leader_tvc_2.csv", encoding = "latin1")
# create survival object
surv_object <- Surv(time, c_coup) ~ legislature + leg_growth_2 + gdp_1k + chgdpen_fearonlaitin + Oil_fearonlaitin + postcoldwarlag + civiliandictatorshiplag + militarydictatorshiplag + communist + lpopl1_fearonlaitin + ethfrac_fearonlaitin + relfrac_fearonlaitin + age
# run survreg
fit <- survreg(surv_object, data = data, dist = "weibull")
# examine output
summary(fit)
I am unable to obtain the same results. For example, the estimate for legislature
in the Stata results (the ones I want to produce) is 2.057. In R
, however, it is 2.48223. Any advice would be appreciated. Here is a link to the full data.
Here is the R output:
Call:
survreg(formula = surv_object, data = data, dist = "weibull")
Value Std. Error z p
(Intercept) 3.69422 0.62813 5.88 4.1e-09
legislature 2.48223 0.21580 11.50 < 2e-16
leg_growth_2 4.06103 1.81609 2.24 0.0253
gdp_1k -0.02217 0.03678 -0.60 0.5467
chgdpen_fearonlaitin 0.49579 1.11792 0.44 0.6574
Oil_fearonlaitin 0.24194 0.23799 1.02 0.3094
postcoldwarlag 0.55797 0.31626 1.76 0.0777
civiliandictatorshiplag -1.87331 0.32142 -5.83 5.6e-09
militarydictatorshiplag -1.34963 0.29635 -4.55 5.3e-06
communist 0.88426 0.28211 3.13 0.0017
lpopl1_fearonlaitin -0.00252 0.06636 -0.04 0.9697
ethfrac_fearonlaitin 0.47922 0.28625 1.67 0.0941
relfrac_fearonlaitin 0.67571 0.37836 1.79 0.0741
age 0.01261 0.00704 1.79 0.0733
Log(scale) -0.12430 0.06339 -1.96 0.0499
Scale= 0.883
Weibull distribution
Loglik(model)= -829 Loglik(intercept only)= -978
Chisq= 297.98 on 13 degrees of freedom, p= 6.4e-56
Number of Newton-Raphson Iterations: 18
n=3774 (5911 observations deleted due to missingness)