How to get the exact same standard errors obtained in Stata when reproducing survival analysis in R?

Question

I am reproducing in R some survival analysis results published in a journal. The original results were produced in Stata. Here are the original results:

Here is the code to produce these results in R:

# load packages
library(dplyr)
library(foreign)
library(msm)
library(stargazer)

# load Svolik's original data 
data = read_stata("leaders, institutions, covariates, updated tvc.dta")

# set a t0 for each row
data = mutate(data,t0 = lag(t,default=0), .by=leadid)

# coup survival object original
survobj_coup = Surv(data[["t0"]], data[["_t"]], data$c_coup)

# coup model original
coups_original <- coxph(survobj_coup~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ communist+ mil+ cw+ age, 
      data=data, ties="breslow")

# revolt survival object original 
survobj_revolt = Surv(data[["t0"]], data[["_t"]], data$c_revolt)

# revolt model original 
revolt_original <- coxph(survobj_revolt~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ mil+ cw+ age, 
                        data=data, ties="breslow")

# natural survival object original
survobj_natural = Surv(data[["t0"]], data[["_t"]], data$c_natural)

# natural model original
natural_original <- coxph(survobj_natural~legislature +  lgdp_1+ growth_1 +exportersoffuelsmainlyoil_EL2008+ ethfrac_FIXED+ communist+ mil+ cw+ age, 
                        data=data, ties="breslow")

# Define a function to exponentiate coefficients
exp_coef <- function(x) {exp(x) }

# Create the table using stargazer
stargazer(natural_original, coups_original, revolt_original, apply.coef = exp_coef, p.auto = FALSE)

While I am able to produce the exact same coefficients (save for slight differences in rounding) with the exact same significance levels, the standard errors do not match. For example, in Model 1 in the figure (first column in Natural Causes), I obtain a standard error of 0.414 rather than 0.198 for the coefficient on Legislature (0.456*). I was reading that the differences may be due to how the standard errors are transformed (something to do with the delta method perhaps). Does anyone have any advice? Thanks.

score 2 · Accepted Answer · answered Jun 14 '23 at 02:11

Quick answer is they measure different things. In your example, Stata reports approximate standard errors of HRs, while R reports SE of coefficient se(coef). I used an example from Stata manual:

library(survival)
library(webuse)

webuse("drugtr")
m_1 <- coxph(Surv(studytime, died) ~ drug + age, drugtr)
summary(m_1)

SE for 'drug' is .0477017 in Stata (SE for HR), but 0.46052 in R (SE for coefficient).

To get the same (or similar) SE values as Stata, you could borrow some functions from R packages. I used deltaMethod in car package.:

library(car)
deltaMethod(m_1, "exp(drug)")

Now it is 0.044426, similar to the value using Stata. There may be other alternatives.

How to get the exact same standard errors obtained in Stata when reproducing survival analysis in R?

1 Answers1