Independent Sample T-Test After Weighting

Question

I am trying to run an independent-samples t-test with weights computed from propensity scores. Y is the outcome variable (continuous) and sec is the grouping variable with two categories (coded 0 and 1). I used the following command:

          wtd.t.test(Ya1, sec1, weight=weights1T)

The following results were produced:

         $test
         [1] "Two Sample Weighted T-Test (Welch)"

         $coefficients
           t.value        df   p.value 
         -25.14739 670.43022   0.00000 

         $additional
           Difference       Mean.x       Mean.y     Std. Err 
         -0.496466247  0.003533753  0.500000000  0.019742259

These results are not clear to me. I want to know the mean for both groups. The results also do not make clear whether the difference is (group 1 - group 0) or (group 0 - group 1). A simple t.test does not account for the weights. How can I deal with this problem?

you should specify what package you are using, because `wtd.t.test` is not in base stats — C8H10N4O2, Sep 22 '16 at 18:38

Ryan C. Thompson · Answer 1 · 2016-09-22T23:00:30.757

You don't specify which package the wtd.t.test function comes from, so I'll assume you are using the one from the "weights" package. According to the documentation, the first two arguments are the data for the two groups, and the third and fourth arguments are the weights for the observations in each group. If the fourth argument is not supplied, the given weights are used for both groups. This means that your code, as written, tests whether the weighted mean of Ya1 differs from the weighted mean of sec1, which would explain why Mean.y is 0.5: it is the weighted mean of the 0/1 group indicator, not a group mean of the outcome. This does not seem to be what you want. I think lm is a better fit for your use case:

# Make some example data
sec1 <- factor(sample(0:1, replace=TRUE, size=700))
Ya1 <- rnorm(700) + as.numeric(sec1)
weights1T <- 1.4^(rnorm(700))
# Use lm() to perform a weighted t-test
summary(lm(Ya1 ~ sec1, weights=weights1T))

which gives:

> summary(lm(Ya1 ~ sec1, weights=weights1T))

Call:
lm(formula = Ya1 ~ sec1, weights = weights1T)

Weighted Residuals:
    Min      1Q  Median      3Q     Max 
-3.1921 -0.6672 -0.0374  0.7025  4.4411 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.92035    0.05376   17.12   <2e-16 ***
sec11        1.11120    0.07874   14.11   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.061 on 698 degrees of freedom
Multiple R-squared:  0.222, Adjusted R-squared:  0.2209 
F-statistic: 199.1 on 1 and 698 DF,  p-value: < 2.2e-16
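
Since the question asks for the mean of each group, here is a minimal sketch of how those can be read off, assuming data simulated the same way as above. The seed is my addition, so the numbers will not match the output shown, and the names group_means, fit and b are only for illustration. With a single two-level factor, the intercept of the weighted lm() should equal the weighted mean of group 0, and the sec11 coefficient should equal the difference (group 1 - group 0):

```r
# Example data generated as in the answer; the seed is an assumption
# added only so that the sketch is reproducible.
set.seed(1)
sec1 <- factor(sample(0:1, replace = TRUE, size = 700))
# as.numeric() on a factor returns the level codes 1 and 2, not 0 and 1
Ya1 <- rnorm(700) + as.numeric(sec1)
weights1T <- 1.4^(rnorm(700))

# Weighted mean of the outcome within each level of sec1 (base R only)
group_means <- sapply(levels(sec1), function(g) {
  weighted.mean(Ya1[sec1 == g], weights1T[sec1 == g])
})

# Weighted regression on the group indicator
fit <- lm(Ya1 ~ sec1, weights = weights1T)
b <- unname(coef(fit))

# Intercept = weighted mean of group 0;
# intercept + slope = weighted mean of group 1
print(group_means)
print(all.equal(unname(group_means), c(b[1], b[1] + b[2])))
```

So the sign of the sec11 coefficient tells you the direction directly: a positive value means group 1 has the larger weighted mean.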

If you really want to use wtd.t.test, you can do so like this:

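library(weights)
# Split the outcome and the weights by group
ysplit <- split(Ya1, sec1)
wsplit <- split(weights1T, sec1)
# x = group 0, y = group 1, each with its own weights
wtd.t.test(x=ysplit[[1]], y=ysplit[[2]],
           weight=wsplit[[1]], weighty=wsplit[[2]])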

which gives you nearly the same answer as lm():

> wtd.t.test(x=ysplit[[1]], y=ysplit[[2]],
+            weight=wsplit[[1]], weighty=wsplit[[2]])
$test
[1] "Two Sample Weighted T-Test (Welch)"

$coefficients
  t.value        df   p.value 
-13.50571 697.25403   0.00000 

$additional
 Difference      Mean.x      Mean.y    Std. Err 
-1.00357229  1.04628894  2.04986124  0.07430724 

Warning message:
In wtd.t.test(y1split[[1]], y1split[[2]], w1split[[1]], w1split[[2]]) :
  Treating data for x and y separately because they are of different lengths

score 1 · Answer 2 · answered Sep 22 '16 at 18:44

Looks to me like the results are right in front of you.

     $additional
       Difference       Mean.x       Mean.y     Std. Err 
     -0.496466247  0.003533753  0.500000000  0.019742259

Mean.x and Mean.y give you the means for the first and second groups (what you are calling group 0 and 1, or Ya1 and sec1 in your code).

Difference is clearly Mean.x minus Mean.y
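
The sign convention can be checked with a few lines of arithmetic on the numbers reported in the question. This is only a quick sketch, and the variable names are illustrative:

```r
# Values copied from the $additional block in the question
mean_x  <- 0.003533753
mean_y  <- 0.500000000
std_err <- 0.019742259

# Difference is Mean.x - Mean.y, and the t value is Difference / Std. Err
difference <- mean_x - mean_y
t_value <- difference / std_err

print(difference)
print(t_value)
```

Both results agree with the reported Difference and t.value up to rounding.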
