I am writing a basic Hadoop word count job in Java and need the output formatted as (k: v) rather than the default (k '\t' v). So far I've only found ways to set the input delimiter using KeyValueTextInputFormat (which is deprecated), and there does not seem to be a corresponding option for the output format. Is there a simple way to do this?
What exactly are you trying to format? Please elaborate. – rVr Mar 18 '14 at 05:39
I am trying to format the (k,v) pairs in the hdfs output. For example running
prints a list as (k '\t' v) but instead I want a list as (k: v). I also edited the question. I didn't realize that inequalities in the post act as comments. – Vince Mar 18 '14 at 14:56
1 Answer
This can be done by setting the mapred.textoutputformat.separator parameter in the job's configuration to the desired delimiter. In your case it would be something like conf.set("mapred.textoutputformat.separator", ":"); (use ": " instead if you want a space after the colon, as in k: v). Depending on the Hadoop version and distribution, the parameter name may differ; with the newer mapreduce API it is mapreduce.output.textoutputformat.separator.
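For context, here is a minimal sketch of where that setting could go in a word count driver. It assumes Hadoop 2 or later with the Hadoop client libraries on the classpath; the class name WordCountDriver is made up for illustration, and the mapper and reducer are the stock TokenCounterMapper and IntSumReducer that ship with Hadoop rather than your own classes. Since I am not sure which property name your version reads, the sketch sets both.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.TokenCounterMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;

// Hypothetical driver class; swap in your own mapper and reducer as needed.
public class WordCountDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Set the separator BEFORE creating the Job, because the Job
        // takes a copy of the configuration when it is constructed.
        // Both property names are set; which one is read depends on
        // the Hadoop version in use.
        conf.set("mapred.textoutputformat.separator", ": ");
        conf.set("mapreduce.output.textoutputformat.separator", ": ");

        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountDriver.class);

        // Stock word-count mapper and summing reducer bundled with Hadoop.
        job.setMapperClass(TokenCounterMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // TextOutputFormat is the output format that honors the separator.
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The separator only affects jobs that write through TextOutputFormat (the default); other output formats ignore it.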

rVr
I found a more detailed answer: http://stackoverflow.com/questions/11031785/hadoop-key-and-value-are-tab-separated-in-the-output-file-how-to-do-it-semicolon – rVr Mar 18 '14 at 15:54