
I am hoping to efficiently combine my regressions using plyr functions. I have data frames with monthly data for multiple years, named in the format yYYYY (so y2014, y2013, etc.).

Right now, I have the below code for one of those dfs, y2014. I am running the regressions by month, as desired within each year.

library(plyr) #needed for ldply
modelsm2= by(y2014,y2014$Date,function(x) lm(y~.,data=x))
summarym2=lapply(modelsm2,summary)
coefficientsm2=lapply(modelsm2,coef)
coefsm2v2=ldply(modelsm2,coef) #to get the coefficients into an exportable df

I have several things I'd like to do and I would really appreciate your help!

A. Extract the r^2 for each model. I know that for one model, you can do summary(model)$r.squared to get it, but I have not had luck with my construct.

B. Apply the same methodology in a loop-type structure to get the models to run for all of my data frames (y2013 and backwards)

C. Get the summary into an easily exportable (to Excel) format --> the ldply function does not work for the summaries.

Thanks again.

Z_D

1 Answer


A. You need to subset out the r.squared values from your summaries:

lapply(summarym2,"[[","r.squared")
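As a minimal sketch on made-up data (the simulated `y2014` below and its columns `Date`, `x1` and `y` are assumptions for illustration, not your real data), `sapply` returns the same values as a named vector, which is easy to turn into a data frame:

```r
# Simulated stand-in for y2014: 3 months, 10 rows each (hypothetical columns)
set.seed(1)
y2014 <- data.frame(Date = rep(c("2014-01", "2014-02", "2014-03"), each = 10),
                    x1 = rnorm(30))
y2014$y <- 2 * y2014$x1 + rnorm(30)

# One model per month; the predictor is named explicitly here because
# Date is constant within each monthly subset
modelsm2 <- by(y2014, y2014$Date, function(x) lm(y ~ x1, data = x))
summarym2 <- lapply(modelsm2, summary)

# Pull r.squared out of each summary; sapply simplifies to a numeric vector
r2 <- sapply(summarym2, "[[", "r.squared")

# One row per month, ready for write.csv
r2df <- data.frame(Date = names(r2), r.squared = unname(r2))
```
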

B. Put all your data into a list, and put another lapply around it, eg:

lmlist <- lapply(list(y2014,y2013,y2012), function(dat)
                                   by(dat,dat$Date, function(x) lm(y~.,data=x))
      )

You will then have a list of lists, so to extract the summaries, for instance, you would use:

lapply(lmlist,lapply,summary)
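Here is a rough, self-contained sketch of the nested pattern on simulated data. The helper `make_year` and the columns `Date`, `x1` and `y` are hypothetical and only there to make the example run. The key point is that the second argument to `lapply` has to be a function that takes one data frame, not the result of a `by()` call on the whole list:

```r
set.seed(42)

# Hypothetical helper: simulate 3 months of data for one year
make_year <- function(year) {
  d <- data.frame(Date = rep(sprintf("%d-%02d", year, 1:3), each = 10),
                  x1 = rnorm(30))
  d$y <- 1 + 2 * d$x1 + rnorm(30)
  d
}

# A named list keeps track of which year each result belongs to
dflist <- list(y2014 = make_year(2014), y2013 = make_year(2013))

# Outer lapply: one element per year; inner by(): one model per month
lmlist <- lapply(dflist, function(dat)
  by(dat, dat$Date, function(x) lm(y ~ x1, data = x)))

# R^2 per year and month: outer lapply over years, inner sapply over months
r2list <- lapply(lmlist, function(models)
  sapply(models, function(m) summary(m)$r.squared))
```

With a named list, `r2list$y2014` holds the monthly R^2 values for 2014, and so on for each year.
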

C. summary returns a fairly complex data structure that cannot be coerced into a data.frame. The result you see is a consequence of the print method for it. You can use capture.output to get a character vector with one element per line of the output, which you can then write to a file.
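A small sketch of both routes, on simulated data (the columns `x1` and `y` and the temporary file names are assumptions for illustration). The first writes the printed summary as text. The second is an alternative that may suit Excel better: the summary object as a whole is not a data frame, but its coefficient table, `coef(summary(model))`, is a plain matrix and converts cleanly:

```r
set.seed(7)
d <- data.frame(x1 = rnorm(20))
d$y <- 3 * d$x1 + rnorm(20)
s <- summary(lm(y ~ x1, data = d))

# Option 1: the printed summary as a character vector, one element per line
txt <- capture.output(print(s))
out_txt <- tempfile(fileext = ".txt")
writeLines(txt, out_txt)

# Option 2: the coefficient table (estimates, std. errors, t and p values)
# is a matrix, so it can be converted and written as CSV
coef_df <- as.data.frame(coef(s))
coef_df$term <- rownames(coef_df)
out_csv <- tempfile(fileext = ".csv")
write.csv(coef_df, out_csv, row.names = FALSE)
```
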

James
  • Thank you, A works great and understood on C. For B, however, can you please elaborate? How do I avoid atomic vector problems within my by function that's in the lapply: the below does not work for me: models=lapply(dflist,by(dflist[],dflist[,1],function(x)lm(y~,data=x))) – Z_D Jul 02 '14 at 16:49
  • Thank you very, very much for your help. Helped a lot with my beginner understanding of the plyr commands. – Z_D Jul 02 '14 at 18:06