
My limited understanding of R has brought my work to a halt, so I'm seeking your help. I want to train a neural network on some time series data and then generate predictions from separate data using the trained model.

I created an xts containing the dependent variable nxtCl (a one-day forward closing stock price) and the independent variables (a set of corresponding prices and technical indicators).

I split the xts in two: miData.train for training and miData.test for testing/prediction. I then converted both to data frames and scaled them.

miData.train <- scale(as.data.frame(miData.train))
miDate.test <- scale(as.data.frame(miData.test))
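One thing worth knowing about the scaling step above: `scale()` returns a matrix even when it is given a data frame. The following is a minimal sketch, using made-up toy values rather than the real miData sets, of scaling a test set with the training set's centre and spread and converting both back to data frames. The names `train.df`, `test.df`, `ctr` and `sds` are illustrative assumptions.

```r
# Toy stand-ins for miData.train / miData.test (hypothetical values)
train.df <- data.frame(Cl = c(10, 12, 14, 16), rsi = c(30, 40, 50, 60))
test.df  <- data.frame(Cl = c(18, 20), rsi = c(70, 80))

# scale() returns a matrix, even when given a data frame
train.scaled <- scale(train.df)
is.matrix(train.scaled)  # TRUE

# Reuse the training centre and spread for the test set
ctr <- attr(train.scaled, "scaled:center")
sds <- attr(train.scaled, "scaled:scale")
test.scaled <- scale(test.df, center = ctr, scale = sds)

# Convert back to data frames before modelling
train.scaled <- as.data.frame(train.scaled)
test.scaled  <- as.data.frame(test.scaled)
```

Scaling the test set with the training parameters keeps both sets on the same footing, which is what a model fitted on the scaled training data would expect.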

Using the package nnet I am able to build a neural network from the training data:

nn <- nnet(nxtCl ~ .,data=miData.train,linout=T,size=10,decay=0.001,maxit=10000)

The str() output for the returned nnet object is:

> str(nn)
List of 18
$ n            : num [1:3] 11 10 1
$ nunits       : int 23
$ nconn        : num [1:24] 0 0 0 0 0 0 0 0 0 0 ...
$ conn         : num [1:131] 0 1 2 3 4 5 6 7 8 9 ...
$ nsunits      : num 22
$ decay        : num 0.001
$ entropy      : logi FALSE
$ softmax      : logi FALSE
$ censored     : logi FALSE
$ value        : num 4.64
$ wts          : num [1:131] 2.73 -1.64 1.1 2.41 1.36 ...
$ convergence  : int 0
$ fitted.values: num [1:901, 1] -0.465 -0.501 -0.46 -0.431 -0.485 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:901] "2005-07-15" "2005-07-18" "2005-07-19" "2005-07-20" ...
.. ..$ : NULL
$ residuals    : num [1:901, 1] -0.0265 0.0487 0.0326 -0.0384 0.0632 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:901] "2005-07-15" "2005-07-18" "2005-07-19" "2005-07-20" ...
.. ..$ : NULL
$ call         : language nnet.formula(formula = nxtCl ~ ., data = miData.train, inout = T,      size = 10, decay = 0.001, maxit = 10000)
$ terms        : language nxtCl ~ Op + Hi + Lo + Cl + vul + smaten + smafif + smath + vol + rsi + dvi
$ coefnames    : chr [1:11] "Op" "Hi" "Lo" "Cl" ...
$ xlevels      : Named list()
- attr(*, "class")= chr [1:2] "nnet.formula" "nnet"

I then try to run predict() using this model nn and the data I kept separate, miData.test, with the following call:

preds <- predict(object=nn, miData.test)

and I get the following error:

Error in terms.default(object, data = data) : 
  no terms component nor attribute

Running terms.default on miData.test, I see that my data frame does not have a terms component or attribute:

terms.default(miData.test)
  Error in terms.default(miData.test) : no terms component nor attribute

but is this why the prediction will not run?
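As a rough sketch of what that check does and does not tell you: a plain data frame never carries a terms component, so `terms()` fails on any data frame, while the fitted model is the object expected to hold the terms. The example below uses toy data (the names `toy`, `fit` and `toy.preds` are illustrative assumptions, not the SPY data) and assumes the nnet package is installed.

```r
library(nnet)  # recommended package shipped with R

set.seed(1)
# Toy data, purely illustrative
toy <- data.frame(x1 = rnorm(60), x2 = rnorm(60))
toy$y <- toy$x1 - 0.5 * toy$x2 + rnorm(60, sd = 0.1)
toy.train <- toy[1:40, ]
toy.test  <- toy[41:60, ]

fit <- nnet(y ~ ., data = toy.train, linout = TRUE, size = 3,
            decay = 0.001, maxit = 200, trace = FALSE)

# A plain data frame has no terms, so an error here is expected
df.msg <- tryCatch(terms(toy.test), error = function(e) conditionMessage(e))

# The fitted model is what should carry a proper "terms" object
has.terms <- inherits(fit$terms, "terms")

# Name the newdata argument explicitly when predicting
toy.preds <- predict(fit, newdata = toy.test)
```

If `inherits(nn$terms, "terms")` came back FALSE for a real model, that would point at the model object rather than at the test data, though nothing in the post confirms this was the cause here.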

miData.test has names that match the terms of nn:

> nn$terms
nxtCl ~ Op + Hi + Lo + Cl + vul + smaten + smafif + smath + vol + 
rsi + dvi

> names(miData.test)
[1] "Op"     "Hi"     "Lo"     "Cl"     "vul"    "smaten" "smafif" "smath"  "vol"    "rsi"    "dvi"    "nxtCl"

And, in terms of structure, the data is exactly the same as that which was used to build nn in the first place. I tried adding my own named attributes to miData.test, matching the terms of nn but that did not work. The str() of miData.test returns:

> str(miData.test)
'data.frame':   400 obs. of  12 variables:
$ Op    : num  82.2 83.5 80.2 79.8 79.8 ...
$ Hi    : num  83.8 84.2 83 79.9 80.2 ...
$ Lo    : num  81 82.7 79.2 78.3 78 ...
$ Cl    : num  83.7 82.8 79.2 79 78.2 ...
$ vul   : num  4.69e+08 2.94e+08 4.79e+08 3.63e+08 3.17e+08 ...
$ smaten: num  84.1 84.1 83.8 83.3 82.8 ...
$ smafif: num  86.9 86.8 86.7 86.6 86.4 ...
$ smath : num  111 111 111 110 110 ...
$ vol   : num  0.335 0.341 0.401 0.402 0.382 ...
$ rsi   : num  45.7 43.6 36.6 36.3 34.7 ...
$ dvi   : num  0.00968 0.00306 -0.01575 -0.01189 -0.00623 ...
$ nxtCl : num  82.8 79.2 79 78.2 77.4 ...
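The comparison of names and column types described above can also be done programmatically. Below is a minimal sketch; `check_newdata()` is a hypothetical helper written for illustration, not part of nnet, and the toy terms and data frame stand in for `nn$terms` and miData.test.

```r
# check_newdata() is a hypothetical helper, not part of nnet
check_newdata <- function(model.terms, newdata) {
  # Predictor names the model expects (response removed)
  needed <- all.vars(delete.response(model.terms))
  is.num <- vapply(newdata, is.numeric, logical(1))
  list(
    missing     = setdiff(needed, names(newdata)),
    non.numeric = names(newdata)[!is.num]
  )
}

# Toy example; with a fitted model you would pass nn$terms and miData.test
toy.terms <- terms(nxtCl ~ Op + Hi + Cl)
toy.test  <- data.frame(Op = 82.2, Hi = 83.8, Cl = "83.7", nxtCl = 82.8)
res <- check_newdata(toy.terms, toy.test)
```

Here `res$missing` is empty because every predictor is present, while `res$non.numeric` flags `Cl`, which was deliberately entered as text in the toy data.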

Any help or insight in getting predict() to work in this instance would be greatly appreciated. Thanks.


Here's some reproducible code. In putting this together, I have 'removed' the error. Unfortunately, although it now works, I am none the wiser as to what was causing the problem before:

    require(quantstrat)
    require(PerformanceAnalytics)
    require(nnet)
    initDate <- "2004-09-30"
    endDate <- "2010-09-30"
    symbols <- c("SPY")
    getSymbols(symbols, from=initDate, to=endDate, index.class=c("POSIXt","POSIXct"))
    rsi <- RSI(Cl(SPY))
    smaTen <- SMA(Cl(SPY))
    smaFif <- SMA(Cl(SPY),n=50)
    nxtCl <- lag(Cl(SPY),-1)
    tmp <- SPY[,-5]
    tmp <- tmp[,-5]
    miData <- merge(tmp,rsi,smaTen,smaFif,nxtCl)
    names(miData) <- c("Op","Hi","Lo","Cl","rsi","smaTen","smaFif","nxtCl")
    miData <- miData[50:1512]
    scaled.miData <- scale(miData)
    miData.train <- as.data.frame(scaled.miData[1:1000])
    miData.test <- as.data.frame(scaled.miData[1001:1463])
    nn <- nnet(nxtCl ~ .,data=miData.train,linout=T,size=10,decay=0.001,maxit=10000)
    preds <- predict(object=nn, miData.test)  
tfb
  • Without `miData.train` and `miData.test`, the problem is not reproducible: your code actually works on random data. – Vincent Zoonekynd Feb 16 '12 at 23:40
  • As an aside, don't run `scale` separately on your training and test datasets. This will result in each of them being scaled differently, so your predictions will be invalid. Combine them and run `scale` once, or save the scaling factors from your training data and apply them manually to your test data. – Hong Ooi Feb 17 '12 at 00:25
  • @HongOoi thank you for the tip about scaling. I can see I hadn't really thought through the effect of re-scaling separately. – tfb Feb 17 '12 at 10:48
  • @VincentZoonekynd I have edited the above to add complete reproducible code. In constructing the reproducible code, the error I was encountering has gone, so I'm thinking the shut down & restart that I did on both myself and R last night has cleared out whatever sloppy code had previously caused the problem. Thanks for taking the time to look, anyway... – tfb Feb 17 '12 at 10:52
  • Check that `miData.train` and `miData.test` are equivalent in their data types via `str()`. Does it work if you do `predict(nn, newdata = miData.test)`? – Gavin Simpson Oct 18 '12 at 10:45
