What is the best way to convert a PairRDD into an RDD in which K and V are merged (in Java)?

For example, the PairRDD contains K as some string and V as a JSON. I want to add this K to the value JSON and produce an RDD.

Input PairRDD

("abc", {"x":"100", "y":"200"})
("def", {"x":"400", "y":"500"})

Output should be an RDD as follows

({"x":"100", "y":"200","z":"abc"})
({"x":"400", "y":"500","z":"def"})
Manikandan Kannan

1 Answer

You can use map to translate between the two. Consider:

scala> pairrdd.foreach(println)
(def,Map(x -> 400, y -> 500))
(abc,Map(x -> 100, y -> 200))

(I think that's what your sample is meant to represent)

scala> val newrdd = pairrdd.map(x => x._2 ++ Map("z" -> x._1))
scala> newrdd.foreach(println)
Map(x -> 100, y -> 200, z -> abc)
Map(x -> 400, y -> 500, z -> def)

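You'll have to translate this into Java syntax (the Scala tuple accessors and the Map ++ operator don't carry over directly), but the approach (I believe) stays the same: map over each pair and add the key to the value.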
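
Since the question asks for Java, here is a minimal sketch of the per-record merge in plain Java. It assumes the value has already been parsed into a `Map<String, String>`, as in the Scala sample above; `KeyMerger` and `merge` are hypothetical names, not part of Spark. The Spark call itself is only shown in a comment so the block runs without Spark on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class KeyMerger {
    // Hypothetical helper: returns a copy of the value map with the key added under "z".
    static Map<String, String> merge(String key, Map<String, String> value) {
        Map<String, String> merged = new LinkedHashMap<>(value); // copy, so the input is not mutated
        merged.put("z", key);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("x", "100");
        first.put("y", "200");

        Map<String, String> second = new LinkedHashMap<>();
        second.put("x", "400");
        second.put("y", "500");

        System.out.println(merge("abc", first));  // {x=100, y=200, z=abc}
        System.out.println(merge("def", second)); // {x=400, y=500, z=def}

        // With Spark available, and assuming pairRdd is a
        // JavaPairRDD<String, Map<String, String>>, the same function
        // could be applied to every record like this:
        //
        //   JavaRDD<Map<String, String>> newRdd =
        //       pairRdd.map(t -> KeyMerger.merge(t._1(), t._2()));
    }
}
```

If the values are JSON strings rather than maps, the same shape applies: parse the string inside the function passed to `map`, add the key, and serialize it again with whatever JSON library you already use.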

James Tobin