After some complicated aggregations I ended up with a `JavaPairRDD<HashSet<String>, HashMap<String, Double>>` and want to save the result to a file. I believe `saveAsHadoopFile` is a good API for this, but I am having trouble filling in the parameters for `saveAsHadoopFile(path, keyClass, valueClass, outputFormatClass, CompressionCodec)`. Can anyone help?

daydayup
1 Answer
You can write the RDD out as text and parse it back into the desired structure later:
rdd.saveAsTextFile("hdfs:///complete_path_to_hdfs_file/");
If you want to use the saveAsHadoopFile API instead, you can call it like this:
rdd.saveAsHadoopFile(complete_path_to_file, HashSet.class, HashMap.class, TextOutputFormat.class);
Note that TextOutputFormat writes each key and value using toString(), separated by a tab, so the output is still plain text.
You can also pass HadoopOutputFormat.class as the last parameter.
For more information, refer to this link: HadoopFile
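Since the text-based route means parsing the output later, it can help to control the line format yourself instead of relying on the default toString() of HashSet and HashMap. The following is only a minimal sketch in plain Java, not part of Spark: the class name PairLineCodec and its methods are hypothetical, and it assumes the strings never contain a comma, a tab, or an `=` sign. In a Spark job, something like `format` could presumably be applied to each pair in a map step before calling saveAsTextFile.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.StringJoiner;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper (not a Spark or Hadoop class): turns one (key, value)
// pair into a single text line and parses such a line back.
// Assumption: no string contains ',', '\t' or '='.
class PairLineCodec {

    // Produces "k1,k2<TAB>name1=1.0,name2=2.5"; entries are sorted so the
    // output does not depend on hash iteration order.
    static String format(Set<String> key, Map<String, Double> value) {
        StringJoiner keyPart = new StringJoiner(",");
        for (String s : new TreeSet<>(key)) {
            keyPart.add(s);
        }
        StringJoiner valuePart = new StringJoiner(",");
        for (Map.Entry<String, Double> e : new TreeMap<>(value).entrySet()) {
            valuePart.add(e.getKey() + "=" + e.getValue());
        }
        return keyPart + "\t" + valuePart;
    }

    // Recovers the key set from the part before the tab.
    static HashSet<String> parseKey(String line) {
        String part = line.split("\t", 2)[0];
        HashSet<String> out = new HashSet<>();
        if (!part.isEmpty()) {
            for (String s : part.split(",")) {
                out.add(s);
            }
        }
        return out;
    }

    // Recovers the value map from the part after the tab.
    static HashMap<String, Double> parseValue(String line) {
        String part = line.split("\t", 2)[1];
        HashMap<String, Double> out = new HashMap<>();
        if (!part.isEmpty()) {
            for (String kv : part.split(",")) {
                int i = kv.indexOf('=');
                out.put(kv.substring(0, i), Double.parseDouble(kv.substring(i + 1)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> key = new HashSet<>();
        key.add("b");
        key.add("a");
        Map<String, Double> value = new HashMap<>();
        value.put("x", 1.5);
        String line = format(key, value);
        System.out.println(line);
        // Round trip: the parsed collections equal the originals.
        System.out.println(parseKey(line).equals(key) && parseValue(line).equals(value));
    }
}
```

Sorting the entries keeps the saved lines stable across runs, which makes the files easier to diff; if the real data can contain the delimiter characters, an escaping scheme or a structured format would be needed instead.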

Devendra Singh
How do we write it as an Avro file? I tried `pairRdd.saveAsHadoopFile("/user/cloudera/avro/", String.class, Float.class, AvroOutputFormat.class);` and got a `NullPointerException` – Amber Jun 27 '18 at 11:26