I have an RDD in the following format and would like to convert it into an RDD of LabeledPoint so that I can process it with MLlib:
Test: RDD[(Int, Seq[Double])] = Array((1,List(1.0,3.0,8.0)),(2,List(3.0,3.0,8.0)),(1,List(2.0,3.0,7.0)),(1,List(5.0,5.0,9.0)))
I tried using map:
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.mllib.regression.LabeledPoint
Test.map(x=> LabeledPoint(x._1, Vectors.sparse(x._2)))
but I get this error:
mllib.linalg.Vector cannot be applied to (Seq[scala.Double])
So presumably the Seq element needs to be converted first, but I don't know into what.
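For reference, a minimal sketch of one conversion that should fit here, assuming the features are meant to be stored densely. Vectors.sparse expects a size plus indices and values, whereas Vectors.dense accepts an Array[Double], so the Seq can be converted with toArray. The object name LabeledPointConversion and the helpers toLabeledPoint and toLabeledPoints are hypothetical names used only for illustration; they are not part of MLlib.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.rdd.RDD

// Hypothetical wrapper object, introduced only for this sketch.
object LabeledPointConversion {

  // Convert one (label, features) pair into a LabeledPoint.
  // Assumption: a dense vector is what is wanted for the features.
  def toLabeledPoint(pair: (Int, Seq[Double])): LabeledPoint = {
    val (label, features) = pair
    // LabeledPoint takes a Double label; Vectors.dense takes an Array[Double].
    LabeledPoint(label.toDouble, Vectors.dense(features.toArray))
  }

  // Apply the conversion to every element of the RDD.
  def toLabeledPoints(rdd: RDD[(Int, Seq[Double])]): RDD[LabeledPoint] =
    rdd.map(toLabeledPoint)
}
```

With the RDD above, this would be called as LabeledPointConversion.toLabeledPoints(Test). The same thing can be written inline as Test.map(x => LabeledPoint(x._1.toDouble, Vectors.dense(x._2.toArray))).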