I need a function on an RDD, say 'isAllMatched', that takes a predicate as an argument and checks whether every element satisfies it. However, I don't want to scan all elements: as soon as the predicate fails for any element, the function should return false. I also want it to execute in parallel on all worker nodes. Here is the pseudocode:
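def isAllMatched[T : ClassTag](rdd: RDD[T])(pred: T => Boolean): Boolean = {
  // pseudocode: rdd.elements stands for "every element of the RDD"
  for (ele <- rdd.elements) {
    if (!pred(ele)) return false  // stop at the first failing element
  }
  true
}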
Is this possible in Spark? Is there a built-in function for this?
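
For reference, here is a minimal sketch of the kind of behavior I have in mind, not a definitive implementation. It assumes that short-circuiting inside each partition is acceptable: every partition stops at its first failing element, but partitions that are already running are not cancelled when another one fails. The object name RddPredicates is made up for illustration, and isAllMatched is my own helper name, not a built-in Spark function.

```scala
import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

object RddPredicates {
  // Hypothetical helper (name taken from the question), not part of Spark's API.
  def isAllMatched[T: ClassTag](rdd: RDD[T])(pred: T => Boolean): Boolean =
    rdd
      // Iterator.forall stops consuming the partition at the first failing element.
      .mapPartitions(iter => Iterator.single(iter.forall(pred)))
      // Only one Boolean per partition is sent back to the driver.
      .collect()
      // The overall result is true only if every partition reported true.
      .forall(identity)
}
```

Usage would look like RddPredicates.isAllMatched(rdd)(_ > 0). Another option might be rdd.filter(x => !pred(x)).isEmpty(), which as far as I understand only needs to find one counterexample, but it may visit partitions incrementally rather than all at once.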