I have a data frame with several columns. The last column has NAs in, say, the first 50 rows. There are brute-force methods, but how do I write something that detects where the first integer/float value starts?

structure(list(col1 = c(646, 574, 590, 671, 618, 529), col2 = c(438, 
744, 730, 748, 507, 479), col3 = c(493, 661, 651, 715, 582, 571
), col4 = c(1047, 1252, 1335, 1269, 1185, 1147), col5 = c(883, 
1008, 996, 1019, 901, 846), col6 = c(824, 840, 766, 776, 868, 
927), col7 = c(727, 685, 708, 779, 717, 721), col8 = c(NA_real_, 
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_)), .Names = c("col1", 
"col2", "col3", "col4", "col5", "col6", "col7", "col8"), row.names = c(NA, 
6L), class = "data.frame")

For the first 7 columns, I iterate through, isolate each column, and feed it into a time series model:

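library(forecast)  # provides tbats()

for (col in 1:ncol(so)) {
  isoColumn <- so[, col]      # isolate one column of the data frame
  model <- tbats(isoColumn)   # fit a TBATS model to that column
}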

Is there a programming method/algorithm I can use to tell where the first value is so I can truncate those rows before I plug it into the tbats model?

DataTx
  • Possible duplicate of [Find the index position of the first non-NA value in an R vector?](http://stackoverflow.com/questions/6808621/find-the-index-position-of-the-first-non-na-value-in-an-r-vector) – dww Aug 06 '16 at 06:39

2 Answers

You could use which(!is.na(x))[1] to locate the first non-NA value, but why not just do

models <- lapply(so,function(x) tbats(na.omit(x)))

?
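
As a minimal sketch of the first suggestion, the index returned by which(!is.na(x))[1] can be used to drop only the leading NAs while leaving any later NAs in place. The helper name trim_leading_na and the sample vector are made up for illustration and do not come from the question.

```r
# Hypothetical helper: drop leading NAs, keep everything from the first value on.
trim_leading_na <- function(x) {
  first <- which(!is.na(x))[1]     # index of the first non-NA; NA if there is none
  if (is.na(first)) return(x[0])   # all-NA column: return an empty vector
  x[first:length(x)]
}

x <- c(NA, NA, NA, 727, 685, NA, 708)   # made-up example data
trimmed <- trim_leading_na(x)
print(trimmed)
```

The all-NA check matters for a column like col8 in the sample data, where there is no first value to find.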

Ben Bolker
  • Well, some of the columns have a couple of NAs in the middle, and I would rather fill those in with linear interpolation than remove all NAs. – DataTx Aug 06 '16 at 03:18
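
One possible way to combine the two ideas, sketched here with base R's approx(): trim the leading NAs first, then linearly interpolate the interior gaps. The vector is made up, and this is only one option; it assumes at least two non-NA values remain, and trailing NAs would stay NA under approx()'s default rule. The forecast package's na.interp() may be another route worth checking.

```r
x <- c(NA, NA, 10, NA, 30, 40, NA, NA, 70)   # made-up example data

first <- which(!is.na(x))[1]   # position of the first non-NA value
y <- x[first:length(x)]        # c(10, NA, 30, 40, NA, NA, 70)

idx <- seq_along(y)
ok  <- !is.na(y)

# Linear interpolation at every position, using only the observed points.
filled <- approx(idx[ok], y[ok], xout = idx)$y
print(filled)
```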

If dealing with large data, Position can be considerably faster than which when the first non-NA value occurs early in the vector, because it only evaluates elements until a match is found, rather than evaluating the whole vector and then subsetting.

Position(function(x)!is.na(x), x)
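
A small sketch comparing the two approaches on made-up data with 50 leading NAs, as in the question. Both return the same index; Position returns NA when no element matches.

```r
x <- c(rep(NA_real_, 50), 727, 685, 708)   # made-up data: 50 leading NAs

# Stops at the first element for which the predicate is TRUE.
pos_position <- Position(function(v) !is.na(v), x)

# Evaluates the whole vector, then takes the first matching index.
pos_which <- which(!is.na(x))[1]

# With no non-NA element, Position falls back to its nomatch value (NA).
all_na <- Position(function(v) !is.na(v), c(NA, NA))

print(c(pos_position, pos_which))
```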
dww