In the code below, I get a dense Matrix V after computing the SVD. What I want is:
- Given a set of row numbers (say 3, 7, and 9).
- I want to extract rows 3, 7, and 9 of Matrix V.
- I want to calculate the cosine similarity of each of these 3 rows with every row of Matrix V.
- For each row of V, I need to add up the three cosine similarities obtained.
- Finally, I need the index of the row that has the maximum sum.
    import org.apache.spark.mllib.linalg.{Matrix, SingularValueDecomposition, Vector, Vectors}
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    val data = Array(
      Vectors.sparse(5, Seq((1, 1.0), (3, 7.0))),
      Vectors.dense(2.0, 0.0, 3.0, 4.0, 5.0),
      Vectors.dense(4.0, 0.0, 0.0, 6.0, 7.0))

    val dataRDD = sc.parallelize(data)
    val mat: RowMatrix = new RowMatrix(dataRDD)

    // Compute the top 4 singular values and corresponding singular vectors.
    val svd: SingularValueDecomposition[RowMatrix, Matrix] = mat.computeSVD(4, computeU = true)
    val U: RowMatrix = svd.U // The U factor is a RowMatrix.
    val s: Vector = svd.s    // The singular values are stored in a local dense vector.
    val V: Matrix = svd.V    // The V factor is a local dense matrix.
Please advise an efficient method to do this. I have been thinking of converting Matrix V to an IndexedRowMatrix, but when I use the row iterator on V, how do I keep track of the row indices? Is there a better way to do it?
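To make the steps above concrete, here is a minimal sketch of the computation done locally in plain Scala, which may be enough since V is a local matrix. It assumes the rows of V are rebuilt from the column-major array returned by `V.toArray`, using `V.numRows` and `V.numCols`. The names `CosineRowSearch`, `cosine`, `rowsOf`, and `bestRow` are hypothetical helpers, not part of Spark, and row indices are treated as 0-based.

```scala
object CosineRowSearch {
  // Cosine similarity of two equal-length vectors; returns 0.0 if either has zero norm.
  def cosine(a: Array[Double], b: Array[Double]): Double = {
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val normA = math.sqrt(a.map(x => x * x).sum)
    val normB = math.sqrt(b.map(x => x * x).sum)
    if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
  }

  // Rebuild rows from a column-major array (assumed to be the layout of V.toArray).
  // The position in the outer array is the row index, so no separate tracking is needed.
  def rowsOf(values: Array[Double], numRows: Int, numCols: Int): Array[Array[Double]] =
    Array.tabulate(numRows, numCols)((i, j) => values(j * numRows + i))

  // For every row, sum its cosine similarities with the query rows,
  // then return the (0-based) index of the row with the largest sum.
  def bestRow(rows: Array[Array[Double]], queryIdx: Seq[Int]): Int = {
    val queries = queryIdx.map(i => rows(i))
    val scores = rows.map(r => queries.map(q => cosine(q, r)).sum)
    scores.zipWithIndex.maxBy(_._1)._2
  }
}
```

With the `V` from the question, this could be called as `CosineRowSearch.bestRow(CosineRowSearch.rowsOf(V.toArray, V.numRows, V.numCols), Seq(3, 7, 9))`, provided V actually has at least 10 rows. Note that each query row has similarity 1 with itself, so a query row may win; filtering the query indices out of `scores` before taking the maximum is one way to avoid that, if that is the intent.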