1

I am currently trying to calculate stock returns over varying time frames (1, 5, 20, 50, 200, 250 days) for which I couldn't find a convenient solution yet. Quantmod only offers pre-set returns as far as I know.

Hence, I have used a solution based on a lag question in stackoverflow that I amended in order to get returns, but not differences with the following function:

sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n,0)))}

My problem now is that I cannot use this in a xts series, as apparently the result calculates two values: One with the intended n-lag and another one with the current value in the denominator. Interestingly, only the correct value shows up in the dataframe. So I calcualte as follows:

#Calculate returns
cCDAX$R1 = sret(cCDAX$Close, 1)
cCDAX$R5 = sret(cCDAX$Close, 5)

And that gives me the following values:

    Date        Close   Volume      R1              R5
1   2010-01-04  523.96  137055000   NA              NA
2   2010-01-05  523.64  168916800   -0.0006107336   NA
3   2010-01-06  524.33  145659600   0.0013176992    NA
4   2010-01-07  523.83  182195400   -0.0009535979   NA
5   2010-01-08  525.55  214804700   0.0032835080    NA
6   2010-01-11  525.93  189962700   0.0007230520    3.759829e-03
7   2010-01-12  517.59  191580300   -0.0158576236   -1.155374e-02
8   2010-01-13  519.71  185076700   0.0040959060    -8.811245e-03
9   2010-01-14  522.48  167065200   0.0053298955    -2.577172e-03
10  2010-01-15  513.14  208268000   -0.0178762823   -2.361336e-02
11  2010-01-18  516.37  112098400   0.0062945785    -1.817732e-02
12  2010-01-19  520.56  159323200   0.0081143366    5.738132e-03
13  2010-01-20  510.21  167641400   -0.0198824343   -1.827943e-02
14  2010-01-21  501.77  190062800   -0.0165422081   -3.963788e-02
15  2010-01-22  496.67  240544400   -0.0101640194   -3.209650e-02
16  2010-01-25  491.91  199198900   -0.0095838283   -4.736913e-02
17  2010-01-26  494.76  188213100   0.0057937428    -4.956201e-02
18  2010-01-27  492.25  193048200   -0.0050731668   -3.520119e-02
19  2010-01-28  484.26  229885500   -0.0162315896   -3.489647e-02
20  2010-01-29  489.82  252945300   0.0114814356    -1.379185e-02

When I directly type the formula in the console, then the daily returns look as follows:

          lag-1          lag0
1            NA            NA
2 -0.0006107336 -0.0006111069
3  0.0013176992  0.0013159651
4 -0.0009535979 -0.0009545081
5  0.0032835080  0.0032727619

Obviously, as the values (even though not appearing as such) actually have two values, I cannot turn them into a xts object afterwards. Without the xts obejct I cannot run my time-series analysis. The probleme definitely the denominator, but I need the c(-n, 0) in order to get the right calculation. I tried multiple ways like

sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n)))}
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), n))}
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(x, c(-n,0)))}
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(x, n))}

And nothing really worked, so the version on top (also again below) is the only one that provides the correct value, however cannot be processed afterwards... Does anyone have a solution to this that supresses or delates the 0-lag?

sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n,0)))}

Before any manipulation, the output of cCDAX with dput(head(cCDAX, 20)) looks as follows:

> dput(head(cCDAX, 20))
structure(list(Date = structure(c(1262559600, 1262646000, 1262732400, 
1262818800, 1262905200, 1263164400, 1263250800, 1263337200, 1263423600, 
1263510000, 1263769200, 1263855600, 1263942000, 1264028400, 1264114800, 
1264374000, 1264460400, 1264546800, 1264633200, 1264719600), class = c("POSIXct", 
"POSIXt"), tzone = ""), Close = c(523.96, 523.64, 524.33, 523.83, 
525.55, 525.93, 517.59, 519.71, 522.48, 513.14, 516.37, 520.56, 
510.21, 501.77, 496.67, 491.91, 494.76, 492.25, 484.26, 489.82
), Volume = c(137055000L, 168916800L, 145659600L, 182195400L, 
214804700L, 189962700L, 191580300L, 185076700L, 167065200L, 208268000L, 
112098400L, 159323200L, 167641400L, 190062800L, 240544400L, 199198900L, 
188213100L, 193048200L, 229885500L, 252945300L)), row.names = c(NA, 
20L), class = "data.frame")

The procedure I am currently running looks as follows (I omit different lags):

library(vars)
library(fpp2)
library(nortest)
library(ggpubr)
library(xts)
library(highfrequency)
library(quantmod)
library(pracma)
library(zoo)

# clear all
rm(list=ls())

#Moving Average Function
mav = function(x,n){filter(x, rep(1/n,n), sides = 1)}
#Standard Deviation Function
vari = function(x,n){rollapply(x, width = n, FUN = sd, fill = NA, align = c("right"))}
#Return function
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n,0)))}

#Loading data, transfering the Date column in an actual date
cCDAX = read.csv("./CDAX_Clean.csv", header=TRUE, sep=",", dec=".")
cCDAX$Date = as.POSIXct(cCDAX$Date, format = "%d.%m.%Y")

#Adding moving averages for the closing prices
cCDAX$MA5C = mav(cCDAX[,"Close"], 5)
#Calculate standard deviations for the closing prices
cCDAX$SD5C = vari(cCDAX $Close, 5)
#Calculate returns
cCDAX$R1 = sret(cCDAX$Close, 1)
cCDAX$R5 = sret(cCDAX$Close, 5)
#Calculate standard deviation of returns
cCDAX$SD5R = vari(cCDAX $R1, 5)
#Adding moving averages for the daily volumes
cCDAX$MA5V = mav(cCDAX[,"Volume"], 5)
#Calculate standard deviations for the closing prices
cCDAX$SD5V = vari(cCDAX$Volume, 5)
#Calculate change in daily volume
cCDAX$VC1 = sret(cCDAX$Volume, 1)
cCDAX$VC5 = sret(cCDAX$Volume, 5)
#Calculate standard deviation of volume change
cCDAX$SD5VC = vari(cCDAX $VC1, 5)

#Creating a time series; with omitted variables should be [,2:13] instead of [,2:45]
CDAX_ts = as.xts(cCDAX[,2:45], order.by = cCDAX[,1]) 
DC93
  • 13
  • 5

2 Answers2

2

I might be missing something but is the code below what the question asks for? It defines a generic sret and a method for objects of class "xts". A methd for objects of class "zoo" (but xts) could be defined in the same way. Then it's just a matter of calling the function.

library(xts)
library(quantmod)

sret <- function(x, ...) UseMethod("sret", x)
sret.default <- function(x, n = 1){
  m <- length(x)
  y <- rep(NA_real_, m)
  y[(n + 1):m] <- x[seq_len(m - n)]
  (x - y)/y
}
sret.data.frame <- function(x, n){
  i <- sapply(x, is.numeric)
  x[i] <- lapply(x[i], sret.default, n = n)
  x
}
sret.xts <- function(x, n = 1){
  y <- lag(x, n, na.pad = TRUE)
  (x - y)/y
}

getSymbols("AAPL", from = Sys.Date() - 20)
sret(AAPL, 5)
sret(AAPL$AAPL.Open, 2)
#              AAPL.Open
#2021-03-15           NA
#2021-03-16           NA
#2021-03-17  0.021281733
#2021-03-18 -0.022949219
#2021-03-19 -0.034612185
#2021-03-22 -0.021191681
#2021-03-23  0.027811562
#2021-03-24  0.020273555
#2021-03-25 -0.031704877
#2021-03-26 -0.020523490
#2021-03-29  0.017344850
#2021-03-30 -0.001998143
#2021-03-31  0.000000000
#2021-04-01  0.028707770
Rui Barradas
  • 70,273
  • 8
  • 34
  • 66
  • Not sure whether its my lack of R understanding, but it still does not fully work unfortunately, please let me know if you need more info. It actually is (x-y)/y and it worked for the returns nicely, if I turn the dataframe into a xts object first. The problem if I turn the dataframe into a xts object first, is that further manipulations do not work anymore. One example is the following error: "$ operator is invalid for atomic vectors", also screws up the dataframe if I use ["Close"] or [,1]. So I guess its necessary to turn the dataframe in a xts object after all the calculations. – DC93 Apr 03 '21 at 21:05
  • @DC93 Can you edit the question with an example data set? Post the output of `dput(head(cCDAX, 20))`, please. In the question there is no mention to data.frames, only to `xts` and `zoo` objects. – Rui Barradas Apr 03 '21 at 21:14
  • Thanks for clarifying, I mentioned a dataframe right in the beginning, thought its clear from that. Added my (short of repititions) code and the dput output above. Thanks! – DC93 Apr 04 '21 at 10:23
  • @DC93 Done, I think. See now after the edit. I have added two methods, the default method (which I shouldn't have missed) and a `sret.data.frame` method. – Rui Barradas Apr 04 '21 at 11:18
  • AMAZING! Thanks a lot! That worked! I changed again the denominator to y instead of x in order to get the right value (you might correct for that) and could comment out the data.frame and .xts versions. The default one did the job just fine! Any further reading in how you solved that would be much appreciated in order to get a better understanding. – DC93 Apr 04 '21 at 19:36
1

1) diff.zoo and diff.xts have an arithmetic= argument which defaults to TRUE but if FALSE it takes the ratio rather than difference. It can also be used on all columns at once.

library(quantmod)
getSymbols("FB")

k <- 2

diff(Ad(FB), k, arith = FALSE) - 1   # returns over k days of Adjusted Close

diff(FB, k, arith = FALSE) - 1   # returns over k days for each column

2) This also works and only uses base R. If you don't need padding omit the NA part in the examples. k is the number of days for which a return is wanted.

xx <- c(5, 3, 6, 2, 1)
k <- 2
c(rep(NA, k), exp(diff(log(xx), 2)) - 1) 

or on multiple columns at once. BOD is a data frame that comes with R.

rbind(matrix(NA, k, ncol(BOD)), exp(diff(log(as.matrix(BOD)), k)) - 1)
G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • The first solution worked with me in the console, however applied to my dataframe it didn't unfortunately. The second one works! You might include the k in the exponential function as well, in order to have consistency. – DC93 Apr 05 '21 at 08:21
  • If you want to apply it to a data frame convert the data frame to matrix first and then convert back at the end: as.data.frame(diff(as.matrix(BOD), 2)) . Be sure that your data frame is all numeric since it does not make sense to apply diff to other types. – G. Grothendieck Apr 05 '21 at 11:21