I am dealing with a computationally intensive package in R. This package has no alternative implementation that interfaces with a Spark cluster; however, it does have an optional argument that takes a cluster created with the parallel package. My question: can I connect to a Spark cluster using something like sparklyr, and then use that Spark cluster as part of a makeCluster() call to pass into my function?
I have successfully gotten the cluster working with parallel, but I do not know how, or whether, it is possible to leverage the Spark cluster instead.
library(bnlearn)
library(parallel)

# create a local cluster with 3 worker processes
my_cluster <- makeCluster(3)
...
# learn the network structure, distributing work across the cluster
pc_structure <- pc.stable(train[, -1], cluster = my_cluster)
Specifically, can I connect to a Spark cluster as follows:
library(sparklyr)
sc <- spark_connect(master = "yarn-client", config = config, version = '1.6.2')
and then leverage the connection (the sc object) in the makeCluster() function?
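To make the question concrete, this is roughly what I imagine writing. It is purely hypothetical: I do not know whether makeCluster() can consume a sparklyr connection at all, and as far as I can tell it only accepts a number of workers or a vector of hostnames, not a spark_connection object.

library(sparklyr)
library(parallel)
library(bnlearn)

sc <- spark_connect(master = "yarn-client", config = config, version = '1.6.2')

# Hypothetical step: turn the Spark connection into something the
# parallel package understands. This line is the part I am unsure
# about; makeCluster() normally takes a worker count or a character
# vector of node hostnames, not a spark_connection.
my_cluster <- makeCluster(sc)

# If the step above worked, the rest would be unchanged from my
# current parallel-based code.
pc_structure <- pc.stable(train[, -1], cluster = my_cluster)

If passing sc directly is not possible, is there some other way to point makeCluster() at the nodes of the Spark cluster?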