I have a VectorWritable (org.apache.mahout.math.VectorWritable) read from a sequence file generated by Mahout, and I would like to convert it into a Spark Vector (org.apache.spark.mllib.linalg.Vector). How can I do that in Scala?

zero323
HHH

1 Answer

Assuming we have an RDD[(Text, VectorWritable)] named rdd, as in your previous question:

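import scala.collection.JavaConverters.iterableAsScalaIterableConverter

// Copy every element of the Mahout vector into a dense Spark MLlib vector.
def mahoutToScala(v: org.apache.mahout.math.VectorWritable): org.apache.spark.mllib.linalg.Vector = {
    // v.get unwraps the Mahout Vector; all() iterates over every element, zeros included.
    val scalaArray = v.get.all.asScala.map(_.get).toArray
    org.apache.spark.mllib.linalg.Vectors.dense(scalaArray)
}

// Convert the Text key to a String and the value to a Spark vector.
rdd.map { case (k, v) => (k.toString, mahoutToScala(v)) }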
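
If the Mahout vectors are mostly zeros, copying every element into a dense array can waste memory. A possible sparse variant is sketched below. It assumes a Mahout version whose Vector exposes nonZeroes() (older releases used iterateNonZero() instead), and the name mahoutToSparse is only illustrative:

```scala
import scala.collection.JavaConverters.iterableAsScalaIterableConverter
import org.apache.mahout.math.VectorWritable
import org.apache.spark.mllib.linalg.{Vector, Vectors}

// Hypothetical helper: keeps only the non-zero entries of the Mahout vector.
// Assumes Vector.nonZeroes() is available in the Mahout version on the classpath.
def mahoutToSparse(v: VectorWritable): Vector = {
  val mv = v.get
  // Extract (index, value) pairs eagerly while iterating, because Mahout
  // iterators may reuse the same Element instance between steps.
  val nonZero = mv.nonZeroes.asScala.map(e => (e.index, e.get)).toSeq
  // Vectors.sparse takes the full vector size plus the (index, value) pairs.
  Vectors.sparse(mv.size, nonZero)
}
```

It can be applied the same way as the dense version: rdd.map { case (k, v) => (k.toString, mahoutToSparse(v)) }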
zero323