obtaining average coefficients and adj. R^2 from multiple pooled regressions using lapply

Question

I have performed multiple Pooled regressions with a loop function and stored the regression output in a list (myregression). What I would like to do now is to efficiently perform the coeftest function in the lmtest package over all my regressions (i.e., myregression list) to adjust standard errors and t-statistics. Finally I would like to obtain the mean of the coefficients, standard errors and t-values.

Here is what I came up so far:

library(plm)
data("Grunfeld", package="plm")

# Store each subset regression in myregression
myregression <- list()

count <- 1

# Regression on six-year subsets of Grunfeld
for(t in 1940:1950){

  myregression[[count]] <- plm(inv ~ value + capital, 
                              subset(Grunfeld, year<=t & year>=t-5),
                              index=c("firm","year"))

# Name each regression based on the year range included in the data subset
names(myregression)[[count]] = paste0("Year_",t)
count <- count+1
}

Here is where my problems kick in: Although i'm able to perform the coeftest function to invidiual components of the list, I'm unable to code the lapply function accordingly.

## Apply coeftest function to all plm-objects
library(lmtest)

coeftest(myregression$Year_1940, vcov=function(x) vcovSCC(x, type="HC3", maxlag=4))
coeftest(myregression$Year_1941, vcov=function(x) vcovSCC(x, type="HC3", maxlag=4))

COEFTEST<-lapply(myregression, coeftest(x, vcov=function(x) vcovSCC(x, type="HC3", maxlag=4)))

## obtaining average coefficients, se's,and t values over all regressions
lapply(COEFTEST, mean)

I hope there is only a minor mistake that I'm unable to see. I further noticed that the plm regression output is smaller than regular lm output is there another way to obtain mean adj. R^2?

coffeinjunky · Accepted Answer · 2014-07-18T10:56:45.553

1

Try

 COEFTEST<-lapply(myregression, function(x) coeftest(x, vcov=vcovSCC(x, type="HC3", maxlag=4)))

Thsi will give you a list of coeftest-outputs for each regression, which you can then use in whatever way you want.

As a sidenote, make sure that whatever you do with this output makes sense. Taking the mean of an coeftest-output is not obvously sensible to me. If you want to have the average over all coefficients, try something like

 mean(sapply(COEFTEST, function(x)  x["capital", "Estimate"]))

Here, sapply retrieves all estimates for the variable capital from the COEFTEST outputs and puts them into a vector.

To access other elements, it is helpful to look at the structure of the objects using str(). For instance, str(summary(myregression[[1]])) reveals that the r-squares are saved under the name r.squared. These you can access using e.g. summary(myregression[[1]])$r.squared for the first regression output. Automating this, you can again use a construct as above, e.g.

 sapply(myregression, function(x) summary(x)$r.squared)

edited Jul 18 '14 at 10:56

answered Jul 18 '14 at 08:54

coffeinjunky

11,254
39
57

Thank you very much that helps a lot! The following lapply function with lapply(COEFTEST, mean) however gives back only one value per each regression. Is it not possible to obtain average values for each attribute? – Gritti Jul 18 '14 at 09:03
Thank you again for your help! I was exactly wondering the same as i posted already this methodological question in cross-validate: http://stats.stackexchange.com/questions/107816/aggregating-pooled-regression-outputs-in-different-years, However no one replied thus far. I found multiple studies with pooled regressions in various years where they reported the following: "... reports the average coefficients and time-series Newey-West t-statistics from pooled regressions estimated each year, using the previous XX years of data" They do not provide more information on how they obtained it however – Gritti Jul 18 '14 at 09:19
I was hesitating to accept as I still don't get how to retrieve certain elements of the summary function. For instance if i use lapply(myregression, summary) I would like to acces the "Adj. R-Squared" field of each regression and apply the mean function. Is this somehow possible? I would have accepted your answer anyway as only this minor issue is unresolved. Thank you very much again! – Gritti Jul 18 '14 at 10:02
Thanks again for this very detailed answer! Thanks to this community I learned a lot in a very short time! aswome! – Gritti Jul 18 '14 at 12:15
Here is what i was able to come up thanks to your help: mean(sapply(myregression, function(x) summary(x)$r.squared["adjrsq"])) and it works ;) – Gritti Jul 18 '14 at 12:27

score 0 · Answer 2 · edited May 23 '17 at 11:57

There is a nicer way to obtain average coefficients (technically known as Mean Group estimators) using pmg() in plm. And from the looks of it, you are trying to estimate Fama-MacBeth regressions and SEs.

require(foreign)
require(plm)
require(lmtest)
test <- read.dta("http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/test_data.dta")
fpmg <- pmg(y~x, test, index=c("year","firmid")) ##Fama-MacBeth

> ##Fama-MacBeth
> coeftest(fpmg)

t test of coefficients:

            Estimate Std. Error t value Pr(>|t|)    
(Intercept) 0.031278   0.023356  1.3392   0.1806    
x           1.035586   0.033342 31.0599   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

For further details see Fama-MacBeth and Cluster-Robust (by Firm and Time) Standard Errors in R.

obtaining average coefficients and adj. R^2 from multiple pooled regressions using lapply

2 Answers2

Linked