I would like to loop over a data frame, data_all_long, in long format, whose rows are time series observations identified by a stock ID. With unique() I stored all the distinct IDs in "uniqueIDs". "numberofrows" is the number of time series observations in the long data.

Ideally, the tmp variable should temporarily hold all the data for one specific ID from the long data, so that I can run the regression for that ID and store the result in a vector. The overall outcome should be a vector with the regression coefficients for all the different IDs.

for(i in uniqueIDs){
  for(j in 1:numberofrows){
      tmp <- rbind(tmp,filter(data_all_long, stockId == i))
  }
  beta[,i] <- lm(mrf ~ stockreturn, data = tmp) 
}

Does anyone here have any ideas?

matthias_h
Rbeginner

1 Answer

From what I understand of the problem, the following might do it.
The trick is to split the data by ID and then use sapply to apply an anonymous function to each of the resulting data frames. This function fits the model and extracts its coefficients.

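# Split the long data into a list with one data frame per stock.
# The ID column is called stockId in the question's code; adjust if yours differs.
sp <- split(data_all_long, data_all_long$stockId)
# Fit one regression per stock and keep only the coefficients.
beta <- sapply(sp, function(tmp){
  fit <- lm(mrf ~ stockreturn, data = tmp)
  coef(fit)
})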
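
To see what the result looks like, here is a small self-contained sketch of the same split/sapply idea. The toy data, the three stock IDs and the 0.5 slope are made up purely for illustration, and the ID column is assumed to be called stockId as in the question's code.

```r
# Toy data in long format; the values and the three stock IDs are made up.
set.seed(1)
data_all_long <- data.frame(
  stockId     = rep(c("A", "B", "C"), each = 20),
  stockreturn = rnorm(60)
)
# Hypothetical relationship, only so that lm() has something to fit.
data_all_long$mrf <- 0.5 * data_all_long$stockreturn + rnorm(60, sd = 0.1)

# One data frame per stock ID.
sp <- split(data_all_long, data_all_long$stockId)

# Fit the regression per stock and keep only the coefficients.
beta <- sapply(sp, function(tmp) {
  coef(lm(mrf ~ stockreturn, data = tmp))
})

# beta is a matrix: one row per coefficient, one column per stock ID.
slopes <- beta["stockreturn", ]
print(slopes)
```

Here beta has two rows, "(Intercept)" and "stockreturn", and one column per ID, so beta["stockreturn", ] gives the vector of slopes. This also avoids the problems in the original loop: tmp is never reset between IDs, the inner loop repeats the same filter, and lm() returns a whole model object rather than coefficients.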
Rui Barradas
  • Your solution works very well for my problem, big thanks for that. I would have another question on this in the comments. Maybe you could have a look. – Rbeginner Mar 18 '20 at 19:02