
I am trying to do 10-fold cross-validation in R. Each iteration of the for loop generates a new row with several named columns. I want each iteration's results to go under the matching columns, so that at the end I can compute the average value of each column. The results generated in one iteration may belong to different columns than those of the previous iteration, so the column names also need to be checked. Is there a way to do this? Or would it be better to just compute the column averages on the spot?

for(i in seq(from=1, to=8200, by=820)){
    fold <- df_vector[i:i+819,]
    y_fold_vector <- df_vector[!(rownames(df_vector) %in% rownames(folding)),]
    alpha_coefficient <- solve(K_training, y_fold_vector)
    test_points <- df_matrix[rownames(df_matrix) %in% rownames(K_training), colnames(df_matrix) %in% rownames(folding)]
    predictions <- rbind(predictions, crossprod(alpha_coefficient,test_points))
}
moritf

1 Answer

You are running into the operator precedence of dyadic operators in R. The line should be:

 fold <- df_vector[ i:(i+819), ]

Consider:

> i=1
> i:i+189
[1] 190

The lack of a simple example (or any comments on what your code is supposed to be doing) prevents any testing of the rest of the code, but you can find the precedence of operators at ?Syntax. Unary "-" is higher, but binary "+" is lower than ":".
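To make the precedence point concrete with the question's own offset, here is a minimal sketch comparing the two spellings. The variable names wrong and right are only illustrative and do not come from the original code.

```r
i <- 1

# ":" binds more tightly than binary "+", so this parses as (i:i) + 819
wrong <- i:i+819

# Parentheses force the addition first, giving the intended run of row indices
right <- i:(i+819)

length(wrong)  # a single index, so only one row would be selected
length(right)  # the full block of 820 rows
```

With the unparenthesized form, each "fold" in the loop would therefore contain one row instead of 820.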

(It's also unclear what the folding object is supposed to be. You only defined fold, and it isn't a vector, since you index it as you would a dataframe.)
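As for collecting per-iteration results whose column names differ and averaging them at the end, one possible approach is sketched below. It is untested against the real data: fold_results and its values are made-up stand-ins for whatever each loop iteration produces, and it assumes each result is a named numeric vector.

```r
# Hypothetical per-fold results; each fold may report a different set of columns
fold_results <- list(
  c(a = 1, b = 2),
  c(b = 4, c = 6),
  c(a = 3, c = 8)
)

# Union of all column names seen across the folds
all_names <- unique(unlist(lapply(fold_results, names)))

# Indexing a named vector by a name it lacks yields NA, which aligns the columns
aligned <- do.call(rbind, lapply(fold_results, function(r) r[all_names]))
colnames(aligned) <- all_names

# Average each column over the folds that actually reported it
col_means <- colMeans(aligned, na.rm = TRUE)
col_means
```

Storing each iteration's result in a list and aligning once at the end avoids growing a matrix with rbind inside the loop, and na.rm = TRUE means a column is averaged only over the folds in which it appeared.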

IRTFM