There are many possible reasons for the differences you observe. Given that you have not supplied a minimal reproducible example or any output, we can only speculate. I am the author of `MatchIt` and `cobalt`, so I can explain the choices there (which are the same) and how I justify them.
For continuous variables, the SMD after matching is the difference in means (weighted by the matching weights) divided by a scaling factor computed in the original sample. I have justified the choice to compute the standardization factor in the original sample here and elsewhere. The standardization factor depends on the chosen target population, but it can be changed by supplying an argument to `s.d.denom`. By default, when matching for the ATT (the default in `MatchIt`), the standardization factor is the standard deviation of the variable in the treated group (again, computed prior to matching). When matching for the ATE, it is the square root of the average of the variances in the two treatment groups. The defaults and allowable arguments are explained in `help("col_w_smd")`.
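To make that arithmetic concrete, here is a small Python sketch (toy data and made-up weights; this is not `cobalt`'s code): the weighted mean difference is computed in the matched sample, while the standardization factor comes from the sample prior to matching.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
treat = rng.integers(0, 2, n)              # 1 = treated, 0 = control
x = rng.normal(loc=treat * 0.5, size=n)    # covariate, imbalanced by design
w = rng.uniform(0.2, 1.0, n)               # stand-in for matching weights

# Weighted means in the (hypothetically) matched sample
m_t = np.average(x[treat == 1], weights=w[treat == 1])
m_c = np.average(x[treat == 0], weights=w[treat == 0])

# ATT default: SD of the treated group, computed BEFORE matching (unweighted)
sd_att = x[treat == 1].std(ddof=1)

# ATE default: square root of the average of the two pre-matching variances
sd_ate = np.sqrt((x[treat == 1].var(ddof=1) + x[treat == 0].var(ddof=1)) / 2)

smd_att = (m_t - m_c) / sd_att
smd_ate = (m_t - m_c) / sd_ate
```

The key point of the sketch is that changing the target estimand changes only the denominator, not the weighted mean difference in the numerator.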
For categorical variables, `cobalt` first splits them into dummy variables, one per category, and then treats each dummy as a separate variable. By default, `cobalt::bal.tab()` produces unstandardized mean differences (i.e., raw differences in proportion) for binary and categorical variables. If you want standardized mean differences, you need to set `binary = "std"`. I explain in the documentation why I think standardized mean differences don't make sense for binary variables. `cobalt` uses a special formula for the variance of binary variables (`smd` does as well), so be sure to take that into consideration when trying to replicate `cobalt`'s results manually.
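A hedged Python sketch of that handling (toy data; my own variable names, not `cobalt`'s internals): split the factor into one dummy per level and take the raw difference in proportions for each. The `p * (1 - p)` variance mentioned in the comment is my understanding of the "special formula" for binary variables, so verify it against the documentation before relying on it.

```python
import numpy as np

treat = np.array([1, 1, 1, 1, 0, 0, 0, 0])                # toy treatment indicator
cat = np.array(["a", "b", "a", "c", "b", "b", "c", "a"])  # 3-level factor

results = {}
for level in np.unique(cat):
    d = (cat == level).astype(float)   # one dummy per category
    p_t = d[treat == 1].mean()
    p_c = d[treat == 0].mean()
    results[level] = p_t - p_c         # default: raw difference in proportions

# With binary = "std", the denominator would instead use a binary variance
# of the form p * (1 - p) (my understanding of the "special formula"),
# not the usual sample variance of the dummy.
```

Note that each level gets its own balance statistic, rather than one statistic for the whole factor.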
I am not sure exactly what `smd` (which is the basis for the calculations in `gtsummary`) does, because its documentation is somewhat sparse and its code (which uses an R6 architecture) is hard for me to read (though, admittedly, so is `cobalt`'s). It seems that `smd` computes the standardization factor in the matched sample when matching weights are supplied (or when only the matched sample is passed to it), and that it always computes the standardization factor as the square root of the average of the variances in the treatment groups. For categorical variables, it computes a single standardized mean difference for the whole variable using the formula described in Yang & Dalton (2012) rather than splitting the variable into separate dummy variables. I explain here why I don't think this is a great idea.
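For reference, here is my reading of the Yang & Dalton (2012) formula as a Python sketch (not `smd`'s actual code, so treat it as an approximation): the proportions of K − 1 of the categories are compared via a Mahalanobis-type distance using the average of the two groups' multinomial covariance matrices.

```python
import numpy as np

def yang_dalton_smd(p_t, p_c):
    """Single SMD for a K-level categorical variable (Yang & Dalton, 2012).

    p_t, p_c: proportions of each level in the treated/control group.
    Only K-1 levels are used, to keep the covariance matrix invertible.
    """
    p_t, p_c = np.asarray(p_t)[:-1], np.asarray(p_c)[:-1]  # drop one level
    diff = p_t - p_c
    # Multinomial covariance: p_k(1 - p_k) on the diagonal, -p_k p_l off it,
    # averaged across the two groups
    S = ((np.diag(p_t) - np.outer(p_t, p_t)) +
         (np.diag(p_c) - np.outer(p_c, p_c))) / 2
    return float(np.sqrt(diff @ np.linalg.solve(S, diff)))
```

For a binary variable (K = 2), this reduces to the familiar difference in proportions divided by the square root of the average of the two groups' `p(1 - p)` variances.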
Hopefully this sheds some light on these differences. I would encourage you to use `cobalt` rather than `gtsummary` for producing balance tables because of the amount of research that went into choosing these settings; they represent what are, in my opinion, best practices. `cobalt` also gives you the flexibility to supply your own choices if you disagree, and by making those choices yourself, you get to know exactly how each value is calculated. I have also worked hard to ensure `cobalt` is thoroughly documented to help users understand exactly what is going on. Everything I described about `cobalt`'s functionality is explained in the documentation.