
I'm using the Covariate Balancing Propensity Score (CBPS) package and I want to estimate robust standard errors for my ATT results that incorporate the weights. The MatchIt and twang tutorials both recommend using the survey package to incorporate weights into the estimate of robust standard errors, and it seems to work:

library(survey)  # weights-aware design-based (robust) standard errors
design.CBPS <- svydesign(ids = ~1, weights = CBPS.object$weights, data = SUCCESS_All.01)
SE <- svyglm(dv ~ treatment, design = design.CBPS)

Additionally, the survey SEs are substantially different from the default lm() coefficient and SE estimates provided by the CBPS package. For those more familiar with either the CBPS or survey packages, is there any reason why this would be inappropriate or violate some assumption of the CBPS method? I don't see anything in the CBPS documentation about how best to estimate standard errors, which is why I'm slightly concerned.

1 Answer


Sandwich (robust) standard errors are the most commonly used standard errors after propensity score weighting (including CBPS). For the ATE, they are known to be conservative (too large), and for the ATT, they can be either too large or too small. For parametric methods like CBPS, it is possible to use M-estimation to account for both the estimation of the propensity scores and the outcome model, but this is fairly complicated, especially for specialized models like CBPS.

The alternative is to use the bootstrap, where you bootstrap both the propensity score estimation and estimation of the treatment effect. The WeightIt documentation contains an example of how to do bootstrapping to estimate the confidence interval around a treatment effect estimate.
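
To make that concrete, here is a minimal sketch of the re-estimation bootstrap using the boot package, with the question's names (`dv`, `treatment`, `SUCCESS_All.01`) and hypothetical covariates `x1` and `x2` standing in for your propensity score model; it is a sketch under those assumptions, not the WeightIt documentation's exact code:

library(boot)
library(WeightIt)

# Re-estimate the CBPS weights within each bootstrap sample so that
# uncertainty in the propensity score model propagates into the SE
att_boot <- function(data, indices) {
  d <- data[indices, ]
  W <- weightit(treatment ~ x1 + x2, data = d,
                method = "cbps", estimand = "ATT")
  fit <- lm(dv ~ treatment, data = d, weights = W$weights)
  coef(fit)["treatment"]  # assumes a numeric 0/1 treatment variable
}

set.seed(1234)
boot_out <- boot(SUCCESS_All.01, att_boot, R = 1000)
boot.ci(boot_out, type = "perc")  # percentile CI for the ATT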

Using the survey package is one way to get robust standard errors, but there are other packages you can use, such as the sandwich package as recommended in the MatchIt documentation. Under no circumstance should you use or even consider the usual lm() standard errors; these are completely inaccurate for inverse probability weights. The AsyVar() function in CBPS seems like it should provide valid standard errors, but in my experience these are also wildly inaccurate (compared to a bootstrap); the function doesn't even get the treatment effect right.
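
For completeness, a sketch of the sandwich route, again assuming the question's `CBPS.object` and variable names; note that, as described above, these SEs treat the estimated weights as fixed:

library(sandwich)
library(lmtest)

# Weighted outcome model using the CBPS weights,
# then an HC-robust (sandwich) covariance matrix
fit <- lm(dv ~ treatment, data = SUCCESS_All.01, weights = CBPS.object$weights)
coeftest(fit, vcov. = vcovHC(fit, type = "HC0"))  # robust (sandwich) SEs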

I recommend you use a bootstrap. It may take some time (you ideally want around 1000 bootstrap replications), but these standard errors will be the most accurate.

Noah
  • Thank you so much for this detailed overview -- I will check out bootstrapping with WeightIt next. – Cavan Bonner May 20 '22 at 23:03
  • Hi Noah, have you ever had an issue with the error `Error during wrapup: contrasts can be applied only to factors with 2 or more levels` for the `WeightIt` bootstrap code? When I ran your code with R = 10 to estimate the total runtime I didn't get the error, but did with R = 1000. I already checked to make sure the treatment factors did indeed have two levels and no NAs. It looks like this might be a bug with the boot() function itself (see [footnote 231](https://bookdown.org/anshul302/HE902-MGHIHP-Spring2020/Boot.html#fn231) of this tutorial), but it's unclear how to address that with this code. – Cavan Bonner May 26 '22 at 04:08
  • @CavanBonner this might happen if a predictor has sparse categories and they don't appear in every bootstrap replication. For example, if a bootstrap replication only has men, including a gender variable will yield that error. You can do some checks to prevent this from happening, but they will require additional programming within the loop. Another option is the [weighted fractional bootstrap](https://doi.org/10.1080/00031305.2020.1731599), which was designed to avoid this problem, but I don't know an R implementation. – Noah May 26 '22 at 04:36
  • Also it might help if your nominal variables are coded as factors and not characters. – Noah May 26 '22 at 04:37
  • Ah ok, yeah it definitely is because some of my binary predictors have very low base rates. Thanks for the reference to the weighted fractional bootstrap; it looks like they included some R scripts in the supplementary materials, but there's no off-the-shelf function yet. – Cavan Bonner May 26 '22 at 19:12
  • If you use it, supply the bootstrap weights to the `s.weights` argument of `weightit()` and multiply them by the estimated propensity score weights in each sample to estimate the treatment effect. – Noah May 26 '22 at 19:29
  • @CavanBonner I wrote the `fwb` package to implement the fractional weighted bootstrap, so you might consider giving that a try; see the sketch after these comments. – Noah Oct 19 '22 at 14:54
  • Hi Noah, can we output the matched results from R to CSV, and then run the regression in Stata using the weights produced by `MatchIt` or `CBPS`? This is because I would also like to run a regression on panel data with two-way fixed effects and clustered standard errors, which is not supported in R. – Jacob2309 Mar 22 '23 at 13:45
  • @Jacob2309 You can export the `match.data()` output when using `MatchIt` or the weights from `CBPS()` or `WeightIt::weightit()`. But that method is supported in R. You can simply include the fixed effects variables as categorical predictors in `lm()` or `glm()`, or use `fixest::feols()` or `fixest::feglm()` and supply the fixed effects (a sketch follows these comments). If you need to do multi-way clustering of standard errors, that is supported in R, too. Very little functionality in Stata is not available in R. This might be useful if you are a Stata user: https://stata2r.github.io/ – Noah Mar 22 '23 at 15:44
  • @Noah, thanks. That looks like a decent website :). Just to confirm, the usage of matching is actually the usage of the weights, similar to [MatchIt](https://cran.r-project.org/web/packages/MatchIt/vignettes/MatchIt.html#reporting-results), right? In other words, it is actually a weighted regression utilizing the weights produced by `CBPS` or `MatchIt`. If it is using `MatchIt`, when we run `matchit()`, we filter out control group samples that are not matched to a treatment group unit. But when we run `npCBPS` in `CBPS` for a continuous treatment, given the treatment is continuous, there is no control group whose treatment value is 0, so essentially we only use the weights produced by `CBPS` to run a weighted regression, right? – Jacob2309 Mar 23 '23 at 02:59
  • @Jacob2309 Yes, that's right. Note there are many other options for estimating weights for continuous treatments available in `WeightIt`, so don't use `CBPS` just because it's available. Colleagues and I just developed a new method that performs well: http://arxiv.org/abs/2107.07086. The R package is `independenceWeights`, but it's also available in the development version of `WeightIt`. If you have more questions about weighting for continuous treatments, ask a new question since this is a bit off topic. – Noah Mar 23 '23 at 14:57
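
Following up on the fractional weighted bootstrap discussion in the comments, here is a minimal sketch of the `fwb` approach that also applies the `s.weights` advice from the comment above; the covariates `x1` and `x2` are hypothetical stand-ins for your propensity score model:

library(fwb)
library(WeightIt)

# Fractional weighted bootstrap: every unit appears in every replication
# with a random weight, so sparse categories can't drop out entirely.
# Per the comment above, the bootstrap weights are passed to s.weights
# and multiplied into the estimated weights for the outcome model.
att_fwb <- function(data, w) {
  W <- weightit(treatment ~ x1 + x2, data = data,
                method = "cbps", estimand = "ATT", s.weights = w)
  fit <- lm(dv ~ treatment, data = data, weights = w * W$weights)
  coef(fit)["treatment"]  # assumes a numeric 0/1 treatment variable
}

set.seed(1234)
fwb_out <- fwb(SUCCESS_All.01, att_fwb, R = 1000)
summary(fwb_out)

And for the panel-data question, a sketch of the `fixest` route with two-way fixed effects and multi-way clustered standard errors; the `id` and `year` variables and the weight column `w` are hypothetical names for your data:

library(fixest)

# Weighted regression with unit and time fixed effects;
# standard errors clustered by both id and year
fit <- feols(dv ~ treatment | id + year, data = SUCCESS_All.01,
             weights = ~w, cluster = ~id + year)
summary(fit)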