I am currently trying to calculate stock returns over varying time frames (1, 5, 20, 50, 200, 250 days) for which I couldn't find a convenient solution yet. Quantmod only offers pre-set returns as far as I know.
Hence, I have used a solution based on a lag question in stackoverflow that I amended in order to get returns, but not differences with the following function:
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n,0)))}
My problem now is that I cannot use this in a xts series, as apparently the result calculates two values: One with the intended n-lag and another one with the current value in the denominator. Interestingly, only the correct value shows up in the dataframe. So I calcualte as follows:
#Calculate returns
cCDAX$R1 = sret(cCDAX$Close, 1)
cCDAX$R5 = sret(cCDAX$Close, 5)
And that gives me the following values:
Date Close Volume R1 R5
1 2010-01-04 523.96 137055000 NA NA
2 2010-01-05 523.64 168916800 -0.0006107336 NA
3 2010-01-06 524.33 145659600 0.0013176992 NA
4 2010-01-07 523.83 182195400 -0.0009535979 NA
5 2010-01-08 525.55 214804700 0.0032835080 NA
6 2010-01-11 525.93 189962700 0.0007230520 3.759829e-03
7 2010-01-12 517.59 191580300 -0.0158576236 -1.155374e-02
8 2010-01-13 519.71 185076700 0.0040959060 -8.811245e-03
9 2010-01-14 522.48 167065200 0.0053298955 -2.577172e-03
10 2010-01-15 513.14 208268000 -0.0178762823 -2.361336e-02
11 2010-01-18 516.37 112098400 0.0062945785 -1.817732e-02
12 2010-01-19 520.56 159323200 0.0081143366 5.738132e-03
13 2010-01-20 510.21 167641400 -0.0198824343 -1.827943e-02
14 2010-01-21 501.77 190062800 -0.0165422081 -3.963788e-02
15 2010-01-22 496.67 240544400 -0.0101640194 -3.209650e-02
16 2010-01-25 491.91 199198900 -0.0095838283 -4.736913e-02
17 2010-01-26 494.76 188213100 0.0057937428 -4.956201e-02
18 2010-01-27 492.25 193048200 -0.0050731668 -3.520119e-02
19 2010-01-28 484.26 229885500 -0.0162315896 -3.489647e-02
20 2010-01-29 489.82 252945300 0.0114814356 -1.379185e-02
When I directly type the formula in the console, then the daily returns look as follows:
lag-1 lag0
1 NA NA
2 -0.0006107336 -0.0006111069
3 0.0013176992 0.0013159651
4 -0.0009535979 -0.0009545081
5 0.0032835080 0.0032727619
Obviously, as the values (even though not appearing as such) actually have two values, I cannot turn them into a xts object afterwards. Without the xts obejct I cannot run my time-series analysis. The probleme definitely the denominator, but I need the c(-n, 0) in order to get the right calculation. I tried multiple ways like
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n)))}
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), n))}
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(x, c(-n,0)))}
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(x, n))}
And nothing really worked, so the version on top (also again below) is the only one that provides the correct value, however cannot be processed afterwards... Does anyone have a solution to this that supresses or delates the 0-lag?
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n,0)))}
Before any manipulation, the output of cCDAX with dput(head(cCDAX, 20)) looks as follows:
> dput(head(cCDAX, 20))
structure(list(Date = structure(c(1262559600, 1262646000, 1262732400,
1262818800, 1262905200, 1263164400, 1263250800, 1263337200, 1263423600,
1263510000, 1263769200, 1263855600, 1263942000, 1264028400, 1264114800,
1264374000, 1264460400, 1264546800, 1264633200, 1264719600), class = c("POSIXct",
"POSIXt"), tzone = ""), Close = c(523.96, 523.64, 524.33, 523.83,
525.55, 525.93, 517.59, 519.71, 522.48, 513.14, 516.37, 520.56,
510.21, 501.77, 496.67, 491.91, 494.76, 492.25, 484.26, 489.82
), Volume = c(137055000L, 168916800L, 145659600L, 182195400L,
214804700L, 189962700L, 191580300L, 185076700L, 167065200L, 208268000L,
112098400L, 159323200L, 167641400L, 190062800L, 240544400L, 199198900L,
188213100L, 193048200L, 229885500L, 252945300L)), row.names = c(NA,
20L), class = "data.frame")
The procedure I am currently running looks as follows (I omit different lags):
library(vars)
library(fpp2)
library(nortest)
library(ggpubr)
library(xts)
library(highfrequency)
library(quantmod)
library(pracma)
library(zoo)
# clear all
rm(list=ls())
#Moving Average Function
mav = function(x,n){filter(x, rep(1/n,n), sides = 1)}
#Standard Deviation Function
vari = function(x,n){rollapply(x, width = n, FUN = sd, fill = NA, align = c("right"))}
#Return function
sret = function(x,n){(apply(lag(zoo(x), c(-n,0), na.pad = TRUE), 1L, diff)/lag(zoo(x), c(-n,0)))}
#Loading data, transfering the Date column in an actual date
cCDAX = read.csv("./CDAX_Clean.csv", header=TRUE, sep=",", dec=".")
cCDAX$Date = as.POSIXct(cCDAX$Date, format = "%d.%m.%Y")
#Adding moving averages for the closing prices
cCDAX$MA5C = mav(cCDAX[,"Close"], 5)
#Calculate standard deviations for the closing prices
cCDAX$SD5C = vari(cCDAX $Close, 5)
#Calculate returns
cCDAX$R1 = sret(cCDAX$Close, 1)
cCDAX$R5 = sret(cCDAX$Close, 5)
#Calculate standard deviation of returns
cCDAX$SD5R = vari(cCDAX $R1, 5)
#Adding moving averages for the daily volumes
cCDAX$MA5V = mav(cCDAX[,"Volume"], 5)
#Calculate standard deviations for the closing prices
cCDAX$SD5V = vari(cCDAX$Volume, 5)
#Calculate change in daily volume
cCDAX$VC1 = sret(cCDAX$Volume, 1)
cCDAX$VC5 = sret(cCDAX$Volume, 5)
#Calculate standard deviation of volume change
cCDAX$SD5VC = vari(cCDAX $VC1, 5)
#Creating a time series; with omitted variables should be [,2:13] instead of [,2:45]
CDAX_ts = as.xts(cCDAX[,2:45], order.by = cCDAX[,1])