5

I have some pairs cw (Integer i, String word), where i is the number of occurrences of word in a text file.

For each pair, I would simply like a new pair c1 (Integer i, 1), where 1 is a fixed number.

It seems really trivial, but I haven't understood how the map/mapToPair functions actually work.

JavaPairRDD<Integer, Integer> c1 = cw.map(??? -> new Tuple2<Integer, Integer>(??, 1));

I am using Java 8.

abaghel
rugrag

3 Answers

10

If I understand you correctly, you have the following JavaPairRDD:

JavaPairRDD<Integer, String> cw = ...;

Now you want to create the following JavaPairRDD, where the second value is always 1:

JavaPairRDD<Integer, Integer> c1;

One way to get this is to first extract a JavaRDD from the cw JavaPairRDD by calling map, as below. This keeps only the first value of each pair.

JavaRDD<Integer> cw1 = cw.map(tuple -> tuple._1());

Then create a new JavaPairRDD from that JavaRDD using mapToPair:

JavaPairRDD<Integer, Integer> c1 = cw1.mapToPair(i -> new Tuple2<Integer, Integer>(i, 1));

On a single line, you can write it as:

JavaPairRDD<Integer, Integer> c1 = cw.map(tuple -> tuple._1()).mapToPair(i -> new Tuple2<Integer, Integer>(i, 1));
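As a hedged sketch of a shorter variant: mapToPair is also available on JavaPairRDD itself, so the intermediate JavaRDD can be skipped. The lambda receives each whole Tuple2 and returns a new one. The class name CountToOne, the helper toOnes, the sample data, and the local[*] master are illustrative assumptions, not part of the question.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class CountToOne {

    // Hypothetical helper: turn each (count, word) pair into (count, 1).
    // The lambda argument is the whole Tuple2; _1() reads its first element.
    static JavaPairRDD<Integer, Integer> toOnes(JavaPairRDD<Integer, String> cw) {
        return cw.mapToPair(t -> new Tuple2<Integer, Integer>(t._1(), 1));
    }

    public static void main(String[] args) {
        // Assumption: run locally, using all available cores.
        SparkConf conf = new SparkConf().setAppName("CountToOne").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Made-up sample data standing in for the real cw RDD.
            List<Tuple2<Integer, String>> pairs = Arrays.asList(
                    new Tuple2<Integer, String>(3, "spark"),
                    new Tuple2<Integer, String>(1, "java"));
            JavaPairRDD<Integer, String> cw = sc.parallelizePairs(pairs);

            JavaPairRDD<Integer, Integer> c1 = toOnes(cw);
            System.out.println(c1.collect());
        }
    }
}
```

This does the same work as the two-step version above in a single pass over cw.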
abaghel
1

This is what you can try:

JavaPairRDD<Integer, Integer> tuples = filtered.mapToPair(
        f -> new Tuple2<Integer, Integer>(
                Integer.parseInt(f[0]),
                Integer.parseInt(f[1])));
KayV
-1

Simply use cw.mapValues(v -> 1);

From the API docs for JavaPairRDD.mapValues():

Pass each value in the key-value pair RDD through a map function without changing the keys; this also retains the original RDD's partitioning.
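A minimal sketch of how this could look in a complete program, assuming the pairs from the question are keyed by the Integer count, so that mapValues leaves the count alone and replaces only the String word. The class name MapValuesExample, the helper toOnes, the sample data, and the local[*] master are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class MapValuesExample {

    // Hypothetical helper: keep every key, replace every value with the constant 1.
    // The target type makes the result a JavaPairRDD<Integer, Integer>.
    static JavaPairRDD<Integer, Integer> toOnes(JavaPairRDD<Integer, String> cw) {
        return cw.mapValues(v -> 1);
    }

    public static void main(String[] args) {
        // Assumption: run locally, using all available cores.
        SparkConf conf = new SparkConf().setAppName("MapValuesExample").setMaster("local[*]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Made-up sample data standing in for the real cw RDD.
            List<Tuple2<Integer, String>> pairs = Arrays.asList(
                    new Tuple2<Integer, String>(3, "spark"),
                    new Tuple2<Integer, String>(1, "java"));
            JavaPairRDD<Integer, String> cw = sc.parallelizePairs(pairs);

            JavaPairRDD<Integer, Integer> c1 = toOnes(cw);
            System.out.println(c1.collect());
        }
    }
}
```

Because the keys are untouched, Spark can keep the existing partitioning, which is the point the quoted documentation makes.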

Brad
  • Why the mark down? This is a much cleaner solution that answers the question ... _I would like to simply have for each pair a new pair c1 (Integer i, 1) with 1 fixed number_ – Brad May 21 '18 at 15:55