How do I get this function to run on multiple cores/processors in R?

Question

How would I go about running this function across multiple cores on my machine to speed it up? The function implements a backfitting algorithm to fit a nonparametric bivariate linear model using functional data.

The ffunopare.knn.gcv() function implements the functional Nadaraya-Watson estimator to estimate the regression function.
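
For reference, the model and estimator described above are usually written as follows. This is a sketch of the standard textbook form, not taken from the package source: the symbols `d` (the semimetric, here PCA-based with `q=4`), `K` (a kernel), and `h_k(x)` (a k-nearest-neighbour bandwidth) are assumptions about how the function works internally.

```latex
% Additive model that the backfitting loop targets
Y = r_1(X_1) + r_2(X_2) + \varepsilon

% Functional Nadaraya-Watson estimator with a kNN bandwidth
\hat{r}(x) =
  \frac{\sum_{i=1}^{n} Y_i \, K\!\left( d(x, X_i) / h_k(x) \right)}
       {\sum_{i=1}^{n} K\!\left( d(x, X_i) / h_k(x) \right)}
```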

library(bbefkr)

backfit = function(X1, X2, Y, eps){
  # fix r(x1) and r(x2)
  rx1_storage = list()
  rx2_storage = list()
  
  rx1_init = matrix(1, nrow=1000, ncol=100)
  rx2_init = rx1_init
  
  rx1_storage[[1]] = rx1_init
  rx2_storage[[1]] = rx2_init
  
  i=2
  
  repeat{
    a = Y_RESPONSE3 - rx1_storage[[i-1]]
    
    # a~r(x2)
    a_func_of_x2 = ffunopare.knn.gcv(RESPONSES=a, CURVES=X2,
                                     PRED=X2,q=4,semimetric="pca") # Y - r(X1)
    rx2 = a_func_of_x2$Estimated.values
    # we get r(x2) 
    
    b = Y_RESPONSE3 - rx2
    
    # b~r(x1)
    b_func_of_x1 = ffunopare.knn.gcv(RESPONSES=b, CURVES=X1,
                                     PRED=X1, q=4,semimetric="pca")
    # we get r(x1)
    rx1 = b_func_of_x1$Estimated.values
    
    rx1_storage[[i]] = rx1
    rx2_storage[[i]] = rx2
    
    dmax_rx1 = max(abs(rx1_storage[[i]] - rx1_storage[[i-1]]))
    dmax_rx2 = max(abs(rx2_storage[[i]] - rx2_storage[[i-1]]))
    max_dist = max(c(dmax_rx1, dmax_rx2)) # want the maximum of this vector to be <= epsilon
    
    print(max_dist)
    if(max_dist <= eps) 
      break # if it converges, break the loop
    #print(max_dist)
    i = i+1 # update i
  }
  return(list(rx1=rx1_storage[[i]], rx2=rx2_storage[[i]],
              iterations=i-1))
}

The iterations cannot be run in parallel, because iteration `i+1` depends on the result of iteration `i`... unless I am misunderstanding your code/question? — Mikael Jagan, Nov 12 '21 at 17:55
Stray comments: (1) Iteration `i+1` does not depend on `rx1_storage[[j]]` or `rx2_storage[[j]]` for `j` less than `i`, so it is unnecessary to store all of the elements of those lists unless you intend to return the entire lists. (2) Argument `Y` is unused and variable `Y_RESPONSE3` is undefined. Are these the same object? — Mikael Jagan, Nov 12 '21 at 18:17
Thank you for picking that up, I hadn't noticed it. With regards to the lists, yes I need the entire lists because they contain functional covariates. I'm not sure if you're familiar with functional data analysis, but the data is stored in matrices and these matrices are the elements in that list. — stat_math123, Nov 12 '21 at 18:42
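
To illustrate the points raised in the comments, here is a minimal sketch, not a drop-in replacement. It assumes scalar covariates and uses a hypothetical stand-in smoother (`toy_smoother`, built on `smooth.spline()`) in place of `ffunopare.knn.gcv()`, so that it runs without `bbefkr`. The names `backfit_seq`, `toy_smoother`, `make_data`, and `max_iter` are invented for the example. Within one fit, the two smoother calls are sequential as well, because the second uses the `rx2` just produced by the first. What can run on several cores is a set of independent fits, for example over separate data sets or bootstrap samples.

```r
library(parallel)

# Hypothetical stand-in for ffunopare.knn.gcv(): smooths `response` against a
# scalar covariate `x` and returns the fitted values in the original order.
toy_smoother <- function(response, x) {
  fit <- stats::smooth.spline(x, response, df = 6)
  stats::predict(fit, x)$y
}

# Backfitting that uses the Y argument and keeps only the previous iterate.
# max_iter is an added safeguard against a loop that never converges.
backfit_seq <- function(X1, X2, Y, eps, smoother = toy_smoother, max_iter = 100) {
  rx1 <- rep(0, length(Y))
  rx2 <- rep(0, length(Y))
  for (i in seq_len(max_iter)) {
    rx2_new <- smoother(Y - rx1, X2)      # needs rx1 from the previous iteration
    rx1_new <- smoother(Y - rx2_new, X1)  # needs rx2 from this iteration
    max_dist <- max(abs(rx1_new - rx1), abs(rx2_new - rx2))
    rx1 <- rx1_new
    rx2 <- rx2_new
    if (max_dist <= eps) break
  }
  list(rx1 = rx1, rx2 = rx2, iterations = i, converged = max_dist <= eps)
}

# Simulated data sets (assumption: several independent fits are needed).
set.seed(1)
make_data <- function(n = 200) {
  X1 <- runif(n)
  X2 <- runif(n)
  list(X1 = X1, X2 = X2, Y = sin(2 * pi * X1) + X2^2 + rnorm(n, sd = 0.1))
}
datasets <- lapply(1:4, function(k) make_data())

# Forking is unavailable on Windows, so fall back to a single core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Each data set is fitted in its own process; every fit is still sequential.
fits <- mclapply(
  datasets,
  function(d, fit_fun, smoother) {
    fit_fun(d$X1, d$X2, d$Y, eps = 1e-6, smoother = smoother)
  },
  fit_fun = backfit_seq, smoother = toy_smoother,
  mc.cores = n_cores
)
```

On Windows, a socket cluster created with `makeCluster()` and used through `parLapply()` is the usual alternative to `mclapply()`. If only a single fit is needed, any speed-up has to come from inside the smoother itself, which this sketch does not attempt.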

0 Answers