0

I'm having trouble referencing conditions that take place in multiple rows using a for loop.

The idea is as follows. There is a dataframe with columns LastPrice and KCT. Want to add SignalBinary column to the dataframe, and if

1) LastPrice[j] > KCT[j] in any 3 consecutive rows, &

2) LastPrice[j+1] - LastPrice [j+1+3] > 12 in any of rows j+1 to j+1+10 (i.e. 10 rows below)

then want to record a 1 in SignalBinary[i].

df <- data.frame(nrow =20, ncol =2)
df <- data.frame(LastPrice = c(1221,1220,1220,1217,1216,1218,1216,1216,1217,1220,1219,1218,1220,1216,1217,1218,1218,1207,1206,1205), KCT = c(1218,1218,1219,1218,1221,1217,1217,1216,1219,1216,1217,1216,1219,1217,1218,1217,1217,1217,1219,1217))
df$SignalBinary <-for(j in1:20){for(i in1:10){ifelse (df$LastPrice[j]> df$KCT[j]& df$LastPrice[j+1]> df$KCT[j+1]& df$LastPrice[j+2]> df$KCT[j+2]& df$LastPrice[j+i]- df$LastPrice[j+i+3]>12,1,0)}}

Based on the data, would have expected the code to record a 1 in rows 10 and 11, and 0s in the rest. But I'm doing something wrong. Running the code does not give an error message, but it does not create df$SignalBinary. Running df$SignalBinary says NULL.

BTW the purpose of this is to apply the code to a large database of prices, to run statistics on binary signals.

Hope someone may help. Thank you very much

Krug
  • 1,003
  • 13
  • 33
  • Here's one problem, think about when i=20 and j=10 when you access df$LastPrice[j+i]. Your trying to access row 30 from a data set with only 20 rows. ifelse() doesn't throw an error on this – Steve Bronder Apr 07 '16 at 00:55
  • You are right. Yet don't know how to fix that unfortunately. – Krug Apr 08 '16 at 03:50

2 Answers2

0

One thing that is incorrect is that you aren't returning anything from your ifelse statement (you currently have 1 and 0 as actions if a condition is (isn't) met). I think (but don't quote me) that I have solved your problem in a simpler way, without the use of nested for looping.

df <- data.frame(nrow = 20, ncol = 2)
df <- data.frame(LastPrice = c( 1221, 1220, 1220, 1217, 1216,  1218 , 1216, 1216, 1217, 1220,     1219, 1218, 1220, 1216, 1217, 1218, 1218, 1207, 1206, 1205), KCT = c( 1218, 1218, 1219, 1218, 1221,  1217 , 1217, 1216, 1219, 1216, 1217, 1216, 1219, 1217, 1218, 1217, 1217, 1217, 1219, 1217))

df$SignalBinary <- as.numeric(df$LastPrice >= df$KCT & 
                      c(rep(FALSE ,3), diff(df$LastPrice, lag=3) >= 3)) 

So I set up two conditions that have to be met for the vector. First, df$LastPrice has to be greater than (or equal to) df$KCT. Secondly, the lagged difference between in df$LastPrice has to be greater than or equal to 3. I pad the first 3 values with FALSE to make the two vectors of comparable length. If both conditions are met, it records TRUE which I convert to numeric and then place the new column in the data.frame.

Simply replace the values you in your toy example with the values you need for your application, and I think this should work.

Tad Dallas
  • 1,179
  • 5
  • 13
  • Thanks a lot for the help!! Using "diff" like that does the difference between one row and 3 rows prior. I need the difference between one row and every one of the 3 rows prior. Could repeat the diff function 3 times, but only the simplified problem presented here compares against 3. In reality want to compare against at least 20, in some cases many more, so don't think there is a way around the for loop. Furthermore, you left out the third (trickier) condition (i.e. to look for the difference only in the 10 rows prior, rather than in every row). Thanks anyways, gave me some food for thought. – Krug Apr 08 '16 at 04:08
0

Solved! Posting the solution. Was considerably more complicated than I thought it would be. Had to change the size of the StrongMove from 12 to 3 otherwise would get no signals given the data I provided on this example.

#Data
df <- data.frame(LastPrice = c( 1221, 1220, 1220, 1217, 1216,  1218 , 1216, 1216, 1217, 1220, 1219, 1218, 1220, 1216, 1217, 1218, 1218, 1207, 1206, 1205), KCT = c( 1218, 1218, 1219, 1218, 1221,  1217 , 1217, 1216, 1219, 1216, 1217, 1216, 1219, 1217, 1218, 1217, 1217, 1217, 1219, 1217))

#Define inputs
StrongMoveWindow = 10     # up to this far below the current row
StrongMoveDur = 3         # check row against another this far down
StrongMoveSize = 3        # for a difference at least this big
PvsKCTDur = 3

#Set variables and define loop boundaries
base_rows = 1:(nrow(df) - StrongMoveDur)  # can't check more than this
candidate_max = pmin(base_rows + StrongMoveWindow, nrow(df) - StrongMoveDur) # for a given base row, this is the maximum row to start checking against
df$StrongMove = rep(NA, nrow(df))
df$SignalBinary = rep(NA, nrow(df)) # pre-allocate a vector of results

#Make StrongMove variable
for (i in seq_along(base_rows)) {
  df$StrongMove[i] = as.numeric(
    any(
      df$LastPrice[(i + 1):candidate_max[i]] - 
        df$LastPrice[((i + 1):candidate_max[i]) + StrongMoveDur] > StrongMoveSize))}

#Make ContPvsKCT variable
library(data.table)
setDT(df)
df[, SingPvsKCT := as.integer(LastPrice > KCT)]
df[, ContPvsKCT := do.call(pmin, shift(SingPvsKCT, 0:(PvsKCTDur-1), type="lead"))]

#Make SignalBinary variable
df$SignalBinary <- ifelse (df$ContPvsKCT == 1 & df$StrongMove == 1, 1, 0)

Big thanks to @Gregor @HubertL @Chris @Psidom @brittenb @Frank

Krug
  • 1,003
  • 13
  • 33