I'm using the MLlib LDA algorithm in Spark 1.3.0 (Scala 2.10.x) through the Spark Java API. When I try to read the document-topic distribution from the LDA model at runtime, I get the following exception:

Exception in thread "main" java.lang.ClassCastException: [Lscala.Tuple2; cannot be cast to scala.Tuple2
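For readers unfamiliar with the "[L...;" notation in the message: it is the JVM's name for an array type, so the exception says that an array of Tuple2 was cast to a single Tuple2. The sketch below reproduces the same kind of failure with plain JDK types only. The class and method names (ArrayCastDemo, castToString) are made up for illustration and have nothing to do with Spark.

```java
class ArrayCastDemo {
    // Tries to treat the value as a single String and
    // returns the exception message if the cast fails.
    static String castToString(Object value) {
        try {
            return "ok: " + (String) value;
        } catch (ClassCastException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        Object array = new String[] {"doc-0"};

        // The JVM name of String[] is "[Ljava.lang.String;".
        System.out.println(array.getClass().getName());

        // Casting the array to its element type fails; the message
        // names the array type with the same "[L...;" prefix.
        System.out.println(castToString(array));

        // Casting an actual element succeeds.
        System.out.println(castToString("doc-0"));
    }
}
```

The exact wording of the exception message differs between JVM versions, but the array type name appears in it either way.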

The relevant Spark code is:

DistributedLDAModel ldaModel = new LDA().setK(3).run(corpus);
RDD<Tuple2<Object, Vector>> topicDist = ldaModel.topicDistributions();

How do I read or display the contents of "topicDist" (the documents and their topic distributions) as a JavaRDD?

zero323
Jay

1 Answer

I found the solution: convert the Scala RDD to a JavaRDD with toJavaRDD() and then call collect(), which returns a java.util.List:

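// Wrap the Scala RDD in a JavaRDD so that collect() returns a java.util.List.
JavaRDD<Tuple2<Object, Vector>> topicDist = ldaModel.topicDistributions().toJavaRDD();

// Bring the (document ID, topic distribution) pairs to the driver.
List<Tuple2<Object, Vector>> list = topicDist.collect();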
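Once the pairs are in a List, displaying them is an ordinary loop. The sketch below is not Spark code. It uses Map.Entry<Long, double[]> as a stand-in for Tuple2<Object, Vector> so that it runs with the JDK alone, and the data values are made up. With the real list, the document ID should come from t._1() and the weights from t._2().toArray(). The names TopicDistPrinter, describe and dominantTopic are hypothetical.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

class TopicDistPrinter {
    // Index of the largest weight, i.e. the dominant topic of one document.
    static int dominantTopic(double[] weights) {
        int best = 0;
        for (int i = 1; i < weights.length; i++) {
            if (weights[i] > weights[best]) {
                best = i;
            }
        }
        return best;
    }

    // One display line per document: ID, full distribution, dominant topic.
    static String describe(long docId, double[] weights) {
        return "doc " + docId + " -> " + Arrays.toString(weights)
                + " (dominant topic: " + dominantTopic(weights) + ")";
    }

    public static void main(String[] args) {
        // Made-up values standing in for the result of topicDist.collect().
        List<Map.Entry<Long, double[]>> collected = new ArrayList<>();
        collected.add(new SimpleEntry<Long, double[]>(0L, new double[] {0.7, 0.2, 0.1}));
        collected.add(new SimpleEntry<Long, double[]>(1L, new double[] {0.1, 0.3, 0.6}));

        for (Map.Entry<Long, double[]> entry : collected) {
            System.out.println(describe(entry.getKey(), entry.getValue()));
        }
    }
}
```

Keep in mind that collect() pulls every document to the driver, so for a large corpus it is probably safer to look at a small sample first.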