Can I pass an entire input split to the mapper, rather than passing each line of the input split separately?

I understand that I need to implement my own custom input format for this. But if I write a WholeFileInputFormat, does the mapper get an entire line or the entire input split?

Does NLineInputFormat solve my problem?

USB

1 Answer

I wouldn't bother with NLineInputFormat. You probably don't always know what N is and you don't need the overhead of the input format reading each file to find the line byte offsets.
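
To show where that overhead comes from, here is a rough plain-Java sketch (not Hadoop's actual code) of what any N-lines-per-split scheme has to do: read every byte of the file just to learn where each group of N lines starts. The names `NLineSplitSketch` and `splitsFor` are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class NLineSplitSketch {

    // Hypothetical helper: returns {startOffset, length} pairs, one per group of n lines.
    // Note that the whole content is scanned before any mapper could start.
    static List<long[]> splitsFor(byte[] content, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<long[]> splits = new ArrayList<>();
        long start = 0;
        int linesInGroup = 0;
        for (int i = 0; i < content.length; i++) {
            if (content[i] == '\n') {
                linesInGroup++;
                if (linesInGroup == n) {
                    // Close the current group right after this newline.
                    splits.add(new long[] {start, i + 1 - start});
                    start = i + 1;
                    linesInGroup = 0;
                }
            }
        }
        // Remaining lines (fewer than n) form the last split.
        if (start < content.length) {
            splits.add(new long[] {start, content.length - start});
        }
        return splits;
    }

    public static void main(String[] args) {
        byte[] data = "a\nbb\nccc\ndddd\neeeee\n".getBytes(StandardCharsets.US_ASCII);
        for (long[] s : splitsFor(data, 2)) {
            System.out.println("start=" + s[0] + " length=" + s[1]);
        }
    }
}
```

With five lines and N = 2, the sketch yields three splits, and it had to walk through all 20 bytes to find them.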

The WholeFileInputFormat from here (which I assume is what you're referencing) will pass the entire file as the value to the map method.
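
On the "whole file or input split" question: as far as I know, the usual WholeFileInputFormat implementations mark files as non-splittable, so each file is exactly one split and the two are the same thing. Below is a simplified plain-Java model of that idea, not the Hadoop API. `SplitModel` and `splits` are names I made up, and the model ignores details such as the slack Hadoop allows on the last split.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitModel {

    // Hypothetical model: returns {offset, length} pairs for one file.
    // A splittable file is cut into chunks of at most splitSize bytes;
    // a non-splittable file is always a single split covering the whole file.
    static List<long[]> splits(long fileSize, long splitSize, boolean splittable) {
        if (splitSize <= 0) {
            throw new IllegalArgumentException("splitSize must be positive");
        }
        List<long[]> result = new ArrayList<>();
        if (!splittable) {
            result.add(new long[] {0, fileSize});
            return result;
        }
        for (long offset = 0; offset < fileSize; offset += splitSize) {
            result.add(new long[] {offset, Math.min(splitSize, fileSize - offset)});
        }
        return result;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 300 * mb;   // assumed example file size
        long splitSize = 128 * mb;  // assumed split size, e.g. one block

        System.out.println("splittable:     " + splits(fileSize, splitSize, true).size() + " splits");
        System.out.println("non-splittable: " + splits(fileSize, splitSize, false).size() + " split");
    }
}
```

In this model a 300 MB file with a 128 MB split size gives three splits when splittable and one when not. So with a whole-file format there is no parallelism within a single file; it only comes from having many files, each handled by its own map task.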

Mike Park
  • But is that the full input file or the input split? If it is the full input file, how can Hadoop manage the file when using WholeFileInputFormat, given that the map gets the whole file content and no parallelization will be done? – USB Jun 29 '14 at 09:40