I posted this question yesterday but received some valuable feedback that my post left a little to be desired :). Here is an updated attempt that hopefully is much clearer:

I have an xts/zoo object and would like to determine the t-statistic of the slope coefficient over the last 20 periods using R, and then test whether that t value is > 2.

class(prices)

[1] "xts" "zoo"

tail(prices)
             IWM    SPY    TLT
2012-10-24 81.20 141.02 121.48
2012-10-25 81.53 141.43 120.86
2012-10-26 81.14 141.35 122.64
2012-10-31 81.63 141.35 123.36
2012-11-01 82.49 142.83 122.35
2012-11-02 81.19 141.56 122.26

I could not figure out how to perform a regression for each column versus time so I created a new column (daynumber) with the index and performed the regression versus the index:

lastprices = last(prices,20)
prices.data.frame = as.data.frame(lastprices)
daynumber = index(prices.data.frame)
pdfd = data.frame(prices.data.frame, daynumber)
pdfd.lm = lm(pdfd$daynumber ~ ., data=pdfd)
tstat = coef(summary(pdfd.lm))
tstat[,"t value"]
(Intercept)         IWM         SPY         TLT 
  4.5426630  -0.1788975  -1.3521969  -2.2362345 
tstattest = ifelse(tstat[,"t value"]>2,1,0)
tstattest
(Intercept)         IWM         SPY         TLT 
          1           0           0           0 

I can't help but think this is not the most efficient way to accomplish this task. Does anyone have ideas on how to perform the regression for each column versus time without creating the daynumber column?

Thanks - take it easy on me I just started learning!

Joshua Ulrich
AdmiralF
  • You are obviously working with a zoo or xts data-object, since that is not typical output for an lm-object. There are people who frequent SO that will know what you have been doing, but from the non-financial-quants in the audience, you would get more rapid replies if you included code that would create the object you are working with or you could produce the output of `dput(head(data) )` – IRTFM Nov 03 '12 at 20:07
  • Thanks DWin. The output of dput(head(data) ) is: Error in x[seq_len(n)] : object of type 'environment' is not subsettable. The beginning of the code is as follows: tickers = spl('TLT,GLD,IWM') data <- new.env() getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, auto.assign = T) for(i in ls(data)) data[[i]] = adjustOHLC(data[[i]], use.Adjusted=T) bt.prep(data, align='keep.all', dates='2004:12::') – AdmiralF Nov 04 '12 at 03:21
  • At this point I'm guessing that you are using a package that is not on CRAN, since the only package I can find with an `spl` function is MCMCglmm and that doesn't seem like a good fit to this problem. I've given up, so I didn't look for `bt.prep`. You really do need to provide the names of all the packages with functions you are using. – IRTFM Nov 04 '12 at 08:44
  • DWin - I updated the full post. Hopefully this is much clearer. Would love to hear your feedback. Thanks – AdmiralF Nov 05 '12 at 00:19

1 Answer

I think your original problem had made useful progress. What I thought was an output format that came from an unnamed package was really the output of lm where there were multiple columns on the LHS of the formula. What I have done with your second version is this:

require(zoo)
require(xts)
prices <- read.zoo(text="dt           IWM    SPY    TLT
 2012-10-24 81.20 141.02 121.48
 2012-10-25 81.53 141.43 120.86
 2012-10-26 81.14 141.35 122.64
 2012-10-31 81.63 141.35 123.36
 2012-11-01 82.49 142.83 122.35
 2012-11-02 81.19 141.56 122.26", index=1, header=TRUE, FUN=as.Date, format="%Y-%m-%d")
 prices <- as.xts(prices)
dprice <- as.data.frame(prices)
dprice$dat <- 1:6

# Store the fit so that summary(fit) can be called below
fit <- lm( cbind(IWM,SPY , TLT) ~dat , data=dprice)
fit
#-------------
Call:
lm(formula = cbind(IWM, SPY, TLT) ~ dat, data = dprice)

Coefficients:
             IWM        SPY        TLT      
(Intercept)   81.19800  140.90000  121.24933
dat            0.09486    0.19714    0.25971

Notice that there are no t-values in that output. The summary function will create those:

tstat = coef(summary(fit))
tstat
#--------------
Response IWM :
               Estimate Std. Error     t value     Pr(>|t|)
(Intercept) 81.19800000  0.4993259 162.6152292 8.578215e-09
dat          0.09485714  0.1282151   0.7398284 5.004726e-01

Response SPY :
               Estimate Std. Error    t value     Pr(>|t|)
(Intercept) 140.9000000  0.5356109 263.064096 1.252746e-09
dat           0.1971429  0.1375322   1.433431 2.250283e-01

Response TLT :
               Estimate Std. Error   t value     Pr(>|t|)
(Intercept) 121.2493333  0.7632198 158.86556 9.417108e-09
dat           0.2597143  0.1959767   1.32523 2.557232e-01

To extract the t-values from that list of three matrices you can use lapply:

lapply( coef(summary(fit)), "[" , ,"t value")
#----------
$`Response IWM`
(Intercept)         dat 
162.6152292   0.7398284 

$`Response SPY`
(Intercept)         dat 
 263.064096    1.433431 

$`Response TLT`
(Intercept)         dat 
  158.86556     1.32523

To get just the intercepts you can use this:

 lapply( coef(summary(fit)), "[" , "(Intercept)","t value")
#-----
$`Response IWM`
[1] 162.6152

$`Response SPY`
[1] 263.0641

$`Response TLT`
[1] 158.8656

PLEASE note that the Intercept value has almost no meaning in this context. If you code time as the numbers 1:6 you get one intercept, and if you use the Date-class values you get a completely different one. The t-test is on whether the intercept is zero, and since the scale is rather arbitrary, it has no interpretation unless you are very sure of the coding and what a zero value would imply.
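
Since the question asks about the slope rather than the intercept, here is a minimal sketch of pulling the "dat" row instead and applying the > 2 test. It rebuilds the six rows shown above as a plain data frame so that it runs in base R alone. The names slope_t, slope_flag, and dat_shifted are mine and purely illustrative. The last few lines refit with the time index shifted by an arbitrary 1000 to illustrate that the slope t values do not depend on where the time scale starts, unlike the intercept.

```r
# Rebuild the six rows from the answer as a plain data frame (base R only).
dprice <- data.frame(
  IWM = c(81.20, 81.53, 81.14, 81.63, 82.49, 81.19),
  SPY = c(141.02, 141.43, 141.35, 141.35, 142.83, 141.56),
  TLT = c(121.48, 120.86, 122.64, 123.36, 122.35, 122.26),
  dat = 1:6
)

fit <- lm(cbind(IWM, SPY, TLT) ~ dat, data = dprice)

# Take the t value of the slope (row "dat") from each response's coefficient matrix.
slope_t <- sapply(coef(summary(fit)), function(m) m["dat", "t value"])
names(slope_t) <- sub("^Response ", "", names(slope_t))

# Flag slopes whose t value exceeds 2 (1 = yes, 0 = no), as in the question.
slope_flag <- ifelse(slope_t > 2, 1, 0)

# Illustration only: shift the time coding by an arbitrary constant and refit.
# The intercepts change, but the slope t values should stay the same.
dprice$dat_shifted <- dprice$dat + 1000
fit_shifted <- lm(cbind(IWM, SPY, TLT) ~ dat_shifted, data = dprice)
slope_t_shifted <- sapply(coef(summary(fit_shifted)),
                          function(m) m["dat_shifted", "t value"])

print(slope_t)
print(slope_flag)
```

For the 20-period case in the question, the same pattern should apply with dat = 1:20 on the last 20 rows.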

IRTFM
  • Awesome, lapply was what I needed. Thanks so much for the response. Quick follow-up: To apply a conditional statement to the t value I tried tval <- lapply( coef(summary(fit)), "[" , ,"t value") and then ifelse(tval>2,tval,0) but I received the message: Error in ifelse(tval > 2, tval, 0) : (list) object cannot be coerced to type 'double'. How can I apply the conditional without getting the error? – AdmiralF Nov 05 '12 at 16:41
  • You need to supply `ifelse` with a vector rather than a list, which is what that call above is returning. My last version also returns a list but if you wrapped unlist() around it you would get a vector. So try: > tval <- unlist( lapply( coef(summary(fit)), "[" , "(Intercept)" ,"t value") ); ifelse(tval < 2, tval, 0 ) # Response IWM Response SPY Response TLT 0 0 0 – IRTFM Nov 05 '12 at 16:48
  • Thanks again. I took your advice: >tvaltest <- ifelse(tval>2,tval,0) >tvaltest >Response DBC Response EEM Response EFA Response GLD Response HYG Response IEF Response IWM Response IYR Response MDY Response TLT 0.000000 0.000000 0.000000 0.000000 0.000000 6.377264 0.000000 0.000000 0.000000 9.396363 # This works great but I need to now change the data back to an xts object in the format: DBC EEM EFA 2012-11-06 1.02 1.09 1.09 Any thoughts? I need to get rid of the word "Response" for each symbol and add back the date. tvaltest class = "numeric" – AdmiralF Nov 08 '12 at 15:40
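
One possible way to handle that last follow-up, sketched under assumptions: the tval vector below is a hypothetical stand-in (the IEF and TLT values are taken from the comment above, the DBC value is invented for illustration), and the date 2012-11-06 is assumed to be the stamp wanted for the single row. The "Response " prefix is stripped with sub(), and the named vector is reshaped into a one-row matrix before being passed to xts().

```r
library(xts)  # also attaches zoo

# Hypothetical stand-in for unlist(lapply(coef(summary(fit)), ...));
# the DBC value is made up, IEF and TLT come from the comment above.
tval <- c("Response DBC" = 1.2,
          "Response IEF" = 6.377264,
          "Response TLT" = 9.396363)

# ifelse() needs a vector (not a list); values at or below 2 become 0.
tvaltest <- ifelse(tval > 2, tval, 0)

# Drop the "Response " prefix so only the ticker symbols remain.
symbols <- sub("^Response ", "", names(tval))

# Assumed date stamp for the single row of results.
stamp <- as.Date("2012-11-06")

# One-row matrix with the symbols as column names, indexed by the date.
tval_xts <- xts(matrix(tvaltest, nrow = 1, dimnames = list(NULL, symbols)),
                order.by = stamp)
print(tval_xts)
```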