I have two RDDs, A and B, both of type RDD[Array[Int]],
and I want to compute the set differences A - B and B - A. I tried the following code:
val R1 = A.subtract(B)
val R2 = B.subtract(A)
but it does not return the correct answer. A previous answer mentions that "Performing set operations like subtract with mutable types (Array in this example) is usually unsupported, or at least not recommended." So I changed the code to:
import scala.collection.mutable.ArrayBuffer
val A1 = A.map(_.to[ArrayBuffer]).persist()
val B1 = B.map(_.to[ArrayBuffer]).persist()
val R1 = A1.subtract(B1)
val R2 = B1.subtract(A1)
Now it returns the correct answer. Is there a more efficient way to do this?
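For reference, the difference in behaviour seems to come down to how the element types define equality: a Scala Array is a plain JVM array, so == compares references, while Seq and Buffer compare element by element and hash consistently with that. Below is a minimal sketch in plain Scala, without Spark, assuming nothing beyond the standard library. The names a1, a2 and ArrayEqualitySketch are illustrative only, and the local Set.diff calls are just an analogy for an equality-based set difference, not Spark's actual subtract implementation.

```scala
object ArrayEqualitySketch {
  // Two distinct arrays with identical contents (illustrative data).
  val a1: Array[Int] = Array(1, 2, 3)
  val a2: Array[Int] = Array(1, 2, 3)

  // Arrays use reference equality, so equal contents still compare as different.
  val arraysEqual: Boolean = a1 == a2

  // Seq and Buffer compare structurally, and equal values share a hashCode.
  val seqsEqual: Boolean = a1.toSeq == a2.toSeq
  val buffersEqual: Boolean = a1.toBuffer == a2.toBuffer
  val seqHashesEqual: Boolean = a1.toSeq.hashCode == a2.toSeq.hashCode

  // Local analogy for a set difference: it relies on equals/hashCode of the elements.
  val arrayDiffSize: Int = Set(a1).diff(Set(a2)).size            // a1 is not removed
  val seqDiffSize: Int = Set(a1.toSeq).diff(Set(a2.toSeq)).size  // a1 is removed

  def main(args: Array[String]): Unit = {
    println(s"arrays equal:  $arraysEqual")    // false
    println(s"seqs equal:    $seqsEqual")      // true
    println(s"buffers equal: $buffersEqual")   // true
    println(s"array diff size: $arrayDiffSize, seq diff size: $seqDiffSize") // 1, 0
  }
}
```

If this is indeed the cause, any element type with structural equality (for example the result of _.toSeq) should give the same result as ArrayBuffer in the subtract calls; whether that is actually cheaper depends on the Scala version and would need to be measured.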