
I am working with daily stock data and want to compute monthly betas: the beta for month t should be estimated from daily data over the 12-month window from month t-11 through month t (e.g. the December beta uses daily data from January up to and including December). Additionally, each regression should include a minimum of 150 observations.

I want to calculate the beta as the slope coefficient from a regression of excess stock returns on the excess market return over the past 12 months. My sample data lists the excess returns of each stock (each stock is identified by a number) and the excess market return mktrf in the last column.

I came up with the following code, but it does not work and I cannot find the error. I used width=252 to cover the past 12 months in trading days, but I have not yet included the requirement of at least 150 observations. I also have trouble with the NAs that appear when a stock is delisted. I have searched the forum, and the answers I found use the same code as mine, so I don't know what I'm doing wrong.

rollingbeta <- rollapply(joined_data,
                     width=252,
                     FUN = function(x) {
                       t = lm(formula=paste0(" ` ", x , " ` ~ mktrf"), data = x, na.rm=T);
                       return(t$coef) },
                     by.column=TRUE, 
                      align="right")

Ideally, I would like the output in the same format as my input table.

Any ideas on this? Would appreciate any help!

Here is a sample created with dput:

structure(list(date = structure(c(16804, 16805, 16806, 16807, 16808, 16811, 16812, 16813, 16814, 16815), class = "Date"), 10001 = c(NA, -0.0132978723404255, 0.0148247978436657, 0.0146082337317397, 0.0196335078534031, 0.0346598202824133, 0.0235732009925558, 0, -0.0145454545454544, -0.0172201722017221), 93436 = c(NA, 8.95215075422673e-05, -0.0196482119679542, -0.0154766252739225, -0.0215627173661025, -0.0149289099526067, 0.0101996632186674, -0.0460065723674811, 0.0293045779042485, -0.00577165583470751), mktrf = c(-0.0159, 0.0012, -0.0135, -0.0244, -0.0111, -6e-04, 0.0071, -0.0267, 0.0165, -0.0214)), row.names = c(NA, 10L), class = "data.frame")

Mia
  • Sorry for this @G.Grothendieck. I just wanted to make sure I include enough observations for such a big rolling window. I updated the dput output for a smaller version, hope it works now! – Mia Sep 29 '19 at 07:01
  • data.table objects don't work with `dput` because they contain internal pointers which are not reproducible. Please convert it to a data frame first. – G. Grothendieck Sep 29 '19 at 13:42
  • Sorry for this, didn't realize that! I changed it now. – Mia Sep 29 '19 at 18:07

1 Answer


I can't read your data, but using BOD (which comes with R) we perform a rolling regression over a window of 5 rows. If 5 rows are not available, we use as many rows as are available, provided there are at least 3. We also check that the window contains at least 3 complete cases.

We do this by passing a width vector rather than a single width. Element i of the vector says how many rows to use for the window ending at row i. If at least 5 rows are available we use 5, otherwise we use the number of rows available. However, if fewer than 3 rows are available we still ask for 3, which does not fit and therefore causes NAs to be generated for that row.

Within coefs we also check if there are fewer than 3 complete cases and return NAs if so.

library(zoo)

n <- nrow(BOD)  # 6
w <- pmax(pmin(1:n, 5), 3)  # 3 3 3 4 5 5

coefs <- function(x) {
  if (sum(complete.cases(x)) >= 3) coef(lm(as.data.frame(x))) else c(NA, NA)
}
rollapplyr(BOD[2:1], w, coefs, by.column = FALSE, fill = NA)

giving:

     (Intercept)     Time
[1,]          NA       NA
[2,]          NA       NA
[3,]    1.833333 5.350000
[4,]    5.450000 3.180000
[5,]    7.750000 2.030000
[6,]   10.674324 1.301351

This gives the same:

rbind(c(NA, NA), 
      c(NA, NA), 
      coefs(BOD[1:3, 2:1]), 
      coefs(BOD[1:4, 2:1]), 
      coefs(BOD[1:5, 2:1]), 
      coefs(BOD[2:6, 2:1]))
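The same width-vector construction can be scaled to the numbers in the question (a 252-row window with at least 150 rows). This is only a sketch of the arithmetic: `n = 300` is an arbitrary illustrative length, not the size of the asker's data.

```r
# Illustrative only: n is a made-up number of rows.
n <- 300

# Use as many rows as are available, capped at 252,
# but never ask for fewer than 150.
w <- pmax(pmin(seq_len(n), 252), 150)

# Rows 1 to 149 request 150 rows, which do not fit yet,
# so rollapplyr(..., fill = NA) would return NA there.
w[c(1, 149, 150, 200, 252, 300)]
```

Passing this `w` as the width in `rollapplyr` would then give expanding windows from row 150 up to row 252 and fixed 252-row windows afterwards.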

Update

The shorter dput output added gave a syntax error but this time it was short enough that I was able to edit it (see Note at end) into something that works.

We use a width of 8. If fewer than 8 rows are available we use the number available, with a minimum of 4; with fewer than 4 rows we return NAs. We also require a minimum of 3 complete cases in coefs.

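We convert DF to a zoo object (the date column becomes the index) and assume that the last (3rd) column, mktrf, is the dependent variable and the other 2 columns are the independent variables. The columns are reordered so that mktrf comes first, because lm applied to a data frame without a formula regresses the first column on the rest.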

library(zoo)

n <- nrow(DF)  # 10
w <- pmax(pmin(1:n, 8), 4)  #  [1] 4 4 4 4 5 6 7 8 8 8


coefs <- function(x) {
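  # return one NA per coefficient (intercept + ncol(x) - 1 slopes) when there are too few complete cases
  if (sum(complete.cases(x)) >= 3) coef(lm(as.data.frame(x))) else rep(NA_real_, ncol(x))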
}

z <- read.zoo(DF)[, c(3, 1, 2)]
rollapplyr(z, w, coefs, by.column = FALSE, fill = NA)

giving this zoo object:

            (Intercept)     `10001`    `93436`
2016-01-04           NA          NA         NA
2016-01-05           NA          NA         NA
2016-01-06           NA          NA         NA
2016-01-07 -0.031077046 -2.44567879 -2.7398798
2016-01-08 -0.034601587 -2.65028219 -3.2757924
2016-01-11  0.003069533  0.35951677  1.2452355
2016-01-12 -0.001737647  0.15052197  0.7341502
2016-01-13 -0.001773568  0.09143210  0.5979455
2016-01-14 -0.001173643  0.07543222  0.6164919
2016-01-15 -0.005337689  0.28461054  0.6305395
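For the question's stated goal, one beta per stock, an alternative is to regress each stock's excess return on mktrf separately. The following is a minimal sketch under assumptions, not a tested solution for the full data set: the data are synthetic, and `beta1`, `rolling_beta`, `s1`, `s2`, `width = 10` and `min_obs = 8` are illustrative names and values (the question would use 252 and 150). It uses a single fixed width, so the first `width - 1` rows are NA.

```r
library(zoo)

# Synthetic data for illustration only (not the asker's data):
# two hypothetical stocks whose excess returns load on the market factor.
set.seed(1)
n <- 30
mktrf <- rnorm(n, sd = 0.01)
z <- zoo(cbind(s1 = 0.5 * mktrf + rnorm(n, sd = 0.001),
               s2 = 1.5 * mktrf + rnorm(n, sd = 0.001),
               mktrf = mktrf),
         as.Date("2016-01-04") + 0:(n - 1))
z[1:3, "s1"] <- NA  # mimic missing returns

# Slope from regressing column 1 (stock) on column 2 (market),
# or NA when the window has fewer than min_obs complete rows.
beta1 <- function(xy, min_obs) {
  if (sum(complete.cases(xy)) < min_obs) return(NA_real_)
  fit <- lm(xy[, 1] ~ xy[, 2])  # lm drops incomplete rows by default
  unname(coef(fit)[2])
}

# One rolling regression per stock column; the result is a zoo object
# with the same index as z and one column of betas per stock.
rolling_beta <- function(z, width, min_obs, market = "mktrf") {
  stocks <- setdiff(colnames(z), market)
  m <- sapply(stocks, function(s) {
    r <- rollapplyr(z[, c(s, market)], width, beta1, min_obs = min_obs,
                    by.column = FALSE, fill = NA)
    as.numeric(coredata(r))
  })
  zoo(m, time(z))
}

betas <- rolling_beta(z, width = 10, min_obs = 8)
```

Because the result keeps the dates as the index and has one column per stock, it has the same layout as the input table, minus the mktrf column.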

Note

DF <- structure(list(date = structure(c(16804, 16805, 16806, 16807, 
16808, 16811, 16812, 16813, 16814, 16815), class = "Date"), `10001` = c(NA, 
-0.0132978723404255, 0.0148247978436657, 0.0146082337317397, 
0.0196335078534031, 0.0346598202824133, 0.0235732009925558, 0, 
-0.0145454545454544, -0.0172201722017221), `93436` = c(NA, 8.95215075422673e-05, 
-0.0196482119679542, -0.0154766252739225, -0.0215627173661025, 
-0.0149289099526067, 0.0101996632186674, -0.0460065723674811, 
0.0293045779042485, -0.00577165583470751), mktrf = c(-0.0159, 
0.0012, -0.0135, -0.0244, -0.0111, -6e-04, 0.0071, -0.0267, 0.0165, 
-0.0214)), row.names = c(NA, 10L), class = "data.frame")
G. Grothendieck
  • Thank you for your help! I have tried to replicate your code with my data, but I only receive NAs. I have updated my sample data, so I hope it works now! – Mia Sep 29 '19 at 07:03