I have a `JavaRDD<Tuple2<String, String>>` and need to transform it into a `JavaPairRDD<String, String>`. Currently I do this with a map function that simply returns the input tuple unchanged. Is there a better way?

YuliaSh.
- Ok, so there is no better way in Java, right? – YuliaSh. Nov 19 '14 at 18:37
- `new JavaPairRDD(javaRdd)`? – maasg Nov 19 '14 at 18:59
- Might be. I also finally found `JavaPairRDD.fromJavaRDD(rdd)`. – YuliaSh. Nov 19 '14 at 19:15
- If you are the one constructing the `JavaRDD<Tuple2<String, String>>`, e.g., from a map transformation of a JavaPairRDD, you could instead call mapToPair and avoid having a JavaRDD in the first place. – vefthym Feb 21 '17 at 13:30
5 Answers
`JavaPairRDD.fromJavaRDD(rdd)` is one solution.

YuliaSh.
- `JavaRDD<SmartBuildingNew> buildingRDD = jsc.sparkContext().parallelize(listSmartBuilding);` I want to iterate over this JavaRDD, could you help me? SmartBuildingNew is a POJO class. jsc is the JavaStreamingContext object. – Anshul Kalra Feb 11 '16 at 02:59
For reverse conversion, this seems to work:
JavaRDD.fromRDD(JavaPairRDD.toRDD(rdd), rdd.classTag());

Michal Čizmazia
Try this example:
// mutateFunction is a method that builds an RDD of Tuple2 from the rdd_world RDD
JavaRDD<Tuple2<Integer, String>> mutate = mutateFunction(rdd_world);
JavaPairRDD<Integer, String> pairs = JavaPairRDD.fromJavaRDD(mutate);

3xCh1_23
Try this to transform a JavaRDD into a JavaPairRDD. It works perfectly for me.
JavaRDD<Sensor> sensorRdd = lines.map(new SensorData()).cache();
// Transform the data into a JavaPairRDD keyed by the parsed sId
JavaPairRDD<Integer, Sensor> deviceRdd = sensorRdd.mapToPair(new PairFunction<Sensor, Integer, Sensor>() {
    @Override
    public Tuple2<Integer, Sensor> call(Sensor sensor) throws Exception {
        return new Tuple2<Integer, Sensor>(Integer.parseInt(sensor.getsId().trim()), sensor);
    }
});

Maciej Dobrowolski

Rajeev Rathor
Alternatively, you can call `mapToPair(..)` on your instance of `org.apache.spark.api.java.JavaRDD`.

preeze
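
For comparison, here is a minimal sketch that puts the two approaches from the answers above side by side: `JavaPairRDD.fromJavaRDD` and an identity `mapToPair`. The class name `PairConversionSketch`, the helper method names, the sample data, and the `local[1]` master are all assumptions made for illustration, and the sketch assumes Java 8+ with Spark on the classpath. As far as I understand, `fromJavaRDD` only wraps the existing RDD, while the `mapToPair` variant adds an extra map step to the lineage.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

// Hypothetical class name, used only for this illustration.
public class PairConversionSketch {

    // Re-wraps an RDD of tuples as a pair RDD.
    static JavaPairRDD<String, String> viaFromJavaRDD(JavaRDD<Tuple2<String, String>> tuples) {
        return JavaPairRDD.fromJavaRDD(tuples);
    }

    // Identity mapToPair: each tuple is returned unchanged as the key/value pair.
    static JavaPairRDD<String, String> viaMapToPair(JavaRDD<Tuple2<String, String>> tuples) {
        return tuples.mapToPair(t -> t);
    }

    // Builds a small sample RDD in local mode and converts it with the given approach.
    static Map<String, String> run(boolean useFromJavaRDD) {
        SparkConf conf = new SparkConf()
                .setAppName("pair-conversion-sketch")
                .setMaster("local[1]"); // assumption: local mode, for the example only
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            List<Tuple2<String, String>> data = Arrays.asList(
                    new Tuple2<>("a", "1"),
                    new Tuple2<>("b", "2"));
            JavaRDD<Tuple2<String, String>> tuples = sc.parallelize(data);
            JavaPairRDD<String, String> pairs =
                    useFromJavaRDD ? viaFromJavaRDD(tuples) : viaMapToPair(tuples);
            // collectAsMap brings the pairs back to the driver as a java.util.Map.
            return pairs.collectAsMap();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```

Both helpers return a `JavaPairRDD<String, String>` holding the same pairs, so pair-specific operations such as `reduceByKey` become available either way.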