I am writing a basic Hadoop word count job in Java and need the output to be formatted as (k: v) rather than the default (k '\t' v). So far I have only found ways to set the input delimiter, using KeyValueTextInputFormat (which is deprecated), and there does not seem to be a corresponding option for the output format. Is there a simple way to do this?

Vince
  • What exactly are you trying to format? Please elaborate. – rVr Mar 18 '14 at 05:39
  • I am trying to format the (k,v) pairs in the hdfs output. For example running prints a list as (k '\t' v) but instead I want a list as (k: v). I also edited the question. I didn't realize that inequalities in the post act as comments. – Vince Mar 18 '14 at 14:56

1 Answer

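This can be done by setting the mapred.textoutputformat.separator parameter in the job configuration to the desired delimiter. In your case it would be something like conf.set("mapred.textoutputformat.separator", ": "); (keep the space after the colon if you want exactly k: v). The parameter name depends on the Hadoop version and distribution: in Hadoop 2.x and later it is mapreduce.output.textoutputformat.separator, and the older name is kept as a deprecated alias.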
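
For context, here is a minimal sketch of a complete word count driver with the separator applied. It assumes Hadoop 2.x or later with the client libraries on the classpath and the org.apache.hadoop.mapreduce API; the class names WordCountColon, TokenMapper and SumReducer are made up for illustration and are not from the question. Both property names are set so the same code should work whichever one a given version reads. The property is set before the Job is created because Job.getInstance takes its own copy of the Configuration.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical class name, used only for this sketch.
public class WordCountColon {

    // Emits (word, 1) for every whitespace-separated token in a line.
    public static class TokenMapper extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer tokens = new StringTokenizer(value.toString());
            while (tokens.hasMoreTokens()) {
                word.set(tokens.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Sums the counts for each word.
    public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable total = new IntWritable();

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            total.set(sum);
            context.write(key, total);
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Set the separator BEFORE creating the Job: the Job copies the configuration.
        conf.set("mapreduce.output.textoutputformat.separator", ": "); // Hadoop 2.x+ name
        conf.set("mapred.textoutputformat.separator", ": ");           // older name

        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCountColon.class);
        job.setMapperClass(TokenMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // TextOutputFormat is the default; it is the class that reads the separator.
        job.setOutputFormatClass(TextOutputFormat.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not exist yet
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With this in place, each line of the part-r-* output files should be the key, then ": ", then the value, instead of the tab-separated default. On Hadoop 1.x, where Job.getInstance is not available, new Job(conf, "word count") would take its place.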

rVr
  • I found a more detailed answer: http://stackoverflow.com/questions/11031785/hadoop-key-and-value-are-tab-separated-in-the-output-file-how-to-do-it-semicolon – rVr Mar 18 '14 at 15:54