0

I am a novice R user. I installed Zelig version 4.1-3 and Amelia II version 1.7. I am puzzled on how I can obtain the degrees of freedom, t-statistic, and f-values of combined multiply imputed data using R packages and functions.

First, I loaded Amelia and Zelig:

require(Amelia)
require(Zelig)

Then, I loaded the sample data that came with Amelia:

data(freetrade)

I created 5 imputations for this dataset using the amelia function.

a.out <- amelia(freetrade, m = 5, ts = "year", cs = "country")  

Then, to combine imputations, I used the zelig function:

z.out.imp <- zelig(tariff ~ polity + pop + gdp.pc + year + country, 
     data = a.out$imputations, model = "ls" )

However, I got coefficients that appeared to be coefficients of individual imputations, and not those of the combined set when I used this code:

summary(z.out.imp)

They were as follows:

Coefficients:
                           Value   Std. Error     t-stat      p-value
(Intercept)         2.766176e+03 6.670110e+02  4.1471215 0.0003572868
polity              1.645011e-01 3.078134e-01  0.5344183 0.5938286336
pop                -6.079963e-08 6.518429e-08 -0.9327345 0.3774275934
gdp.pc             -4.246794e-04 1.945866e-03 -0.2182470 0.8319093062
year               -1.335563e+00 3.519513e-01 -3.7947390 0.0009787456
countryIndonesia   -7.000319e+01 4.646330e+01 -1.5066343 0.1700377061
countryKorea       -8.643855e+01 4.671629e+01 -1.8502870 0.0926657863
countryMalaysia    -8.815182e+01 5.389486e+01 -1.6356256 0.1393312364
countryNepal       -8.215250e+01 5.475828e+01 -1.5002753 0.1702129176
countryPakistan    -4.349869e+01 5.149729e+01 -0.8446791 0.4238033944
countryPhilippines -8.088975e+01 5.320694e+01 -1.5202857 0.1673234716
countrySriLanka    -7.668840e+01 5.695485e+01 -1.3464771 0.2161986616
countryThailand    -7.400481e+01 5.186395e+01 -1.4269026 0.1903428838

The Amelia manual shows what some of the coefficients for the combined multiply imputed dataset should be although there is no explanation on how to obtain all of them using R (see page 46 of http://cran.r-project.org/web/packages/Amelia/vignettes/amelia.pdf)

Complete DF = 167
DF:   min   = 10.36
      avg   = 18.81
      max   = 37.62
F( 2, 10.4) = 15.50
Prob > F    = 0.0008


                    Value        Std. Error     t-stat       p-value
polity             -0.206        0.39            -0.53       0.61 
pop                -3.21 e-08    8.72e-09         3.68       0.004
gdp.pc             -0.0027       0.000644        -4.28       0.000
Intercept           32.7         2.66            12.29       0.000

Because the amelia function uses monte carlo simulations, we can expect small differences between runs. However, the huge difference in the intercept was a clue that the zelig function returned regression statistics for something else than the combined set.

The Amelia manual provides this code:

> b.out <-NULL
> se.out <-NULL
> for(i in 1:a.out$m){
+ ols.out <- lm(tariff ~ polity + pop + gdp.pc, data = a.out$imputations[[i]])
+ b.out <- rbind(b.out, ols.out$coef)
+ se.out <-rbind(se.out, coef(summary(ols.out))[,2])
+ }
> combined.results<-mi.meld(q=b.out, se = se.out)
> combined.results

I tried using it. The returned results are very close to the values and standard errors shown on page 46:

$q.mi
     (Intercept)     polity          pop       gdp.pc
[1,]    33.17325 -0.1499587 2.967196e-08 -0.002724229

$se.mi
     (Intercept)   polity          pop       gdp.pc
[1,]    2.116721 0.276968 6.061993e-09 0.0006596203

However, they do not include the t-statistic, degrees of freedom, or f-values.

Are there open-source packages or functions available in R so that I can obtain the degrees of freedom, t-statistic, and f-values without having to do manual calculations?

Thanks.

MichaelChirico
  • 33,841
  • 14
  • 113
  • 198

1 Answers1

1

Here is an annotated and edited transcript of my attempt to reproduce your problem:

> library(Zelig)

ZELIG (Versions 4.1-3, built: 2013-01-30)

> a.out <- amelia(freetrade, idvars="country", m = 5)
Error: could not find function "amelia"

The first problem I had is that you did not mention that we need to load the Amelia package. After correcting that I tried again to run the first line:

> library(Amelia)
## (Version 1.7, built: 2013-02-10)
> a.out <- amelia(freetrade, idvars="country", m = 5)
Error in amelia(freetrade, idvars = "country", m = 5) : 
  object 'freetrade' not found

This fails because you did not say how to get the freetrade data. Guessing here:

> data(freetrade)
> a.out <- amelia(freetrade, m = 5)
Amelia Error Code:  38 
 The following variable(s) are characters: 
     country
You may have wanted to set this as a ID variable to remove it
from the imputation model or as an ordinal or nominal
variable to be imputed.  Please set it as either and
try again. 

Your example does not work because the amelia function needs to be told what to do with character variables. So I modified your example in order to get one that would actually run:

> a.out <- amelia(freetrade, idvars="country", m = 5)
> z.out.imp <- zelig(tariff ~ polity + pop + gdp.pc + year + country, 
+      data = a.out$imputations, model = "ls")

Running summary on the result gives me the combined model statistics"

# This part works just fine.
> summary(z.out.imp)

  Model: ls
  Number of multiply imputed data sets: 5 

Combined results:

Call:
lm(formula = formula, weights = weights, model = F, data = data)

Coefficients:
                           Value   Std. Error     t-stat      p-value
(Intercept)         3.294715e+03 6.425487e+02  5.1275725 1.330807e-05
polity              2.761343e-01 3.354271e-01  0.8232319 4.145813e-01
pop                -6.443769e-08 5.526885e-08 -1.1658953 2.659143e-01
gdp.pc              4.549885e-04 1.354139e-03  0.3359984 7.382138e-01
year               -1.599422e+00 3.306932e-01 -4.8365739 2.649602e-05
countryIndonesia   -7.396526e+01 4.112206e+01 -1.7986760 1.009329e-01
countryKorea       -9.673542e+01 5.036909e+01 -1.9205317 8.713903e-02
countryMalaysia    -9.271187e+01 4.998836e+01 -1.8546690 9.424041e-02
countryNepal       -8.863525e+01 4.920061e+01 -1.8015072 9.990792e-02
countryPakistan    -4.789370e+01 4.362907e+01 -1.0977475 2.960914e-01
countryPhilippines -8.548672e+01 4.662372e+01 -1.8335456 9.533829e-02
countrySriLanka    -8.446560e+01 4.939918e+01 -1.7098586 1.170170e-01
countryThailand    -8.026702e+01 4.741244e+01 -1.6929529 1.213329e-01

For combined results from datasets i to j, use summary(x, subset = i:j).
For separate results, use print(summary(x), subset = i:j).

In short, the only thing in your example that works for me is the one thing you claim does not work for you. Please post the code and output showing exactly what you did and exactly what happened, because at the moment I don't have enough information to help solve the problem.

Ista
  • 10,139
  • 2
  • 37
  • 38
  • @MarkMiller No, the estimates are not the same because a) multiple imputations will differ across runs, and b) the imputations were done differently. The imputations in my attempted reproduction are as close to the OP's example I could get to run. The imputations in the documentation were done differently than the OP's example. But my final result is the combined model statistics the OP was asking for. – Ista Mar 24 '13 at 20:32
  • These do not appear to me to be close to the estimates from page 46 of the document linked by the original poster. – Mark Miller Mar 24 '13 at 20:35
  • Also @MarkMiller the example in the documentation doesn't include country in the model, while the OP here does. There is no reason to expect my answer to reproduce the example in the manual -- it is based on the OP's example, not on the example from the manual! – Ista Mar 24 '13 at 20:47
  • 1
    Sorry. I thought from the OP's comment that the estimates on page 46 were the goal. Hopefully someone can obtain the df, etc., which seem to be the ultimate goal of the post. – Mark Miller Mar 24 '13 at 20:56
  • @MarkMiller: I do not know if you have already solved the problem. I posted a similar question yesterday and just added a possible solution here: http://stackoverflow.com/questions/16694587/how-to-get-measures-of-model-fit-aic-f-statistics-in-zelig-for-multiply-imput/16711190#16711190. Hope it helps. Cheers! – TiF May 23 '13 at 10:14