I'm trying to run a MapReduce job over a large set of preexisting binary files. The files are already there and I can't change their format.

Should I write my own InputFormat for this? How can I make a simple InputFormat that simply returns an InputStream so that I can process the file?

SRobertJames

1 Answer

I don't think there is a built-in InputFormat that ignores splits and feeds the mapper an entire file.

You will need to write your own custom InputFormat. You can find the details here
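
For illustration, here is a minimal sketch of what such an InputFormat might look like. It assumes the newer `org.apache.hadoop.mapreduce` API and that each file fits in a single byte array (under 2 GB). The class names `WholeFileInputFormat` and `WholeFileRecordReader` are made up for this example and are not part of Hadoop.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.TaskAttemptContext;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Hypothetical class name: one record per file, value = the file's raw bytes.
public class WholeFileInputFormat extends FileInputFormat<NullWritable, BytesWritable> {

    // Never split a file, so each mapper call sees one complete file.
    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        return false;
    }

    @Override
    public RecordReader<NullWritable, BytesWritable> createRecordReader(
            InputSplit split, TaskAttemptContext context) {
        return new WholeFileRecordReader();
    }

    // Hypothetical helper: emits exactly one key/value pair per split.
    static class WholeFileRecordReader extends RecordReader<NullWritable, BytesWritable> {
        private FileSplit split;
        private Configuration conf;
        private final BytesWritable value = new BytesWritable();
        private boolean processed = false;

        @Override
        public void initialize(InputSplit split, TaskAttemptContext context) {
            this.split = (FileSplit) split;
            this.conf = context.getConfiguration();
        }

        @Override
        public boolean nextKeyValue() throws IOException {
            if (processed) {
                return false; // the single record was already returned
            }
            // Assumption: the file is small enough to hold in memory.
            byte[] contents = new byte[(int) split.getLength()];
            Path file = split.getPath();
            FileSystem fs = file.getFileSystem(conf);
            try (FSDataInputStream in = fs.open(file)) {
                IOUtils.readFully(in, contents, 0, contents.length);
            }
            value.set(contents, 0, contents.length);
            processed = true;
            return true;
        }

        @Override
        public NullWritable getCurrentKey() {
            return NullWritable.get();
        }

        @Override
        public BytesWritable getCurrentValue() {
            return value;
        }

        @Override
        public float getProgress() {
            return processed ? 1.0f : 0.0f;
        }

        @Override
        public void close() {
            // Nothing to release: the stream is closed in nextKeyValue().
        }
    }
}
```

In the driver you would register it with `job.setInputFormatClass(WholeFileInputFormat.class)`. If you want an InputStream inside the mapper, you can wrap the value with `new ByteArrayInputStream(value.getBytes(), 0, value.getLength())`. Pass the length explicitly, because the backing array of a `BytesWritable` can be longer than the valid data.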

Sudarshan