
In the code below I am trying to read a text file as an RDD. I call the map method because I want to transpose each line and append it to a StringBuilder object. I want to return the StringBuilder's contents once I have finished with each line, but here I am returning the accumulated contents at every line, so the problem shows up when I call saveAsTextFile().

I am getting repeated output:

a b

a b c

a b c d

whereas I want it to be:

a b c d e f

It should not repeat.

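JavaRDD<String> exposuresRdd = ctx.textFile(fname);

// sb is a single StringBuilder shared by every call (declaration not shown)
JavaRDD<String> transformedrdd = exposuresRdd.map(new Function<String, String>() {

    @Override
    public String call(String line) throws Exception {
        sb.append(Something);
        // Returns everything appended so far, including earlier lines
        return sb.toString();
    }
});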
Syed Ammar Mustafa
  • I'm not understanding the question here - if you want to return the string at the end of each line, then the output will be a repeat of previous lines + the new line. Also surely you would be saving transformedRdd and not exposuresRdd.saveAsTextFile()? – Gillespie Sep 28 '15 at 14:56
  • Please provide [a Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve). It is not clear what you want and your code is incomplete. It looks like you are making exactly the same type of mistake as [here](http://stackoverflow.com/q/32798554/1560062) – zero323 Sep 28 '15 at 15:45

1 Answer


First of all, as Gillespie says, at the end I should call transformedrdd.saveAsTextFile() and not exposuresRdd.saveAsTextFile().

I was able to resolve the repeated data by creating a new StringBuilder object for every call, so that each returned string is built from a fresh builder.

Because the shared StringBuilder object already contained the previous lines appended to it, I was getting the repeated data in my final output.

JavaRDD<String> exposuresRdd = ctx.textFile(fname);

JavaRDD<String> transformedrdd = exposuresRdd.map(new Function<String, String>() {

    @Override
    public String call(String line) throws Exception {
        // A fresh builder per call, so nothing carries over from earlier lines
        StringBuilder sb = new StringBuilder();
        sb.append(Something);
        return sb.toString();
    }
});

This makes sure that the string returned from each call contains only the data appended during that particular call. The output is now -> a b c d e f
If we use the same StringBuilder object for all the calls, the output would have been -> a ab abc abcd abcde abcdef
(because of the data appended to the StringBuilder object in the previous calls.)
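The difference between the two approaches can be reproduced without Spark. The following is a minimal sketch in plain Java, not the original job: `BuilderDemo`, `mapLines`, `sharedBuilder` and `freshBuilder` are hypothetical names introduced for illustration, and `mapLines` is only a single-threaded stand-in for `map` that applies the function to each line in order. The input lines "a", "b", "c" are assumed sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class BuilderDemo {

    // Hypothetical stand-in for rdd.map(fn): applies fn to every line in order.
    static List<String> mapLines(List<String> lines, Function<String, String> fn) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(fn.apply(line));
        }
        return out;
    }

    // One builder shared by all calls: each result still contains the earlier lines.
    static List<String> sharedBuilder(List<String> lines) {
        final StringBuilder sb = new StringBuilder();
        return mapLines(lines, line -> {
            sb.append(line);
            return sb.toString();
        });
    }

    // A new builder per call: each result contains only the current line.
    static List<String> freshBuilder(List<String> lines) {
        return mapLines(lines, line -> {
            StringBuilder sb = new StringBuilder();
            sb.append(line);
            return sb.toString();
        });
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c");
        System.out.println(sharedBuilder(lines)); // [a, ab, abc]
        System.out.println(freshBuilder(lines));  // [a, b, c]
    }
}
```

In a real Spark job the function and anything it captures are serialized and shipped to the tasks, so a shared builder would likely accumulate separately in each task rather than across the whole file. That is one more reason to keep the builder local to `call`.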

Syed Ammar Mustafa