It seems that the map.input.start property doesn't give me the position of the start of a line (except, of course, for the first split, where map.input.start is 0). Sometimes map.input.start falls somewhere in the middle of the first line of the mapper's input, and sometimes it falls somewhere in the middle of the last line of the previous mapper's input. Is this to be expected? If so, how can I get the byte offsets of lines? Using TextInputFormat doesn't work, because I'm using Hadoop streaming, which discards the key passed to the mapper.
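To make the behaviour I'm describing concrete, here is a small pure-Python simulation of how I understand a line-oriented reader to treat byte-based splits. It is only a sketch under stated assumptions: splits are plain byte ranges, a reader whose split does not start at 0 backs up one byte and discards everything through the next newline, and a reader finishes any line it has started even if that line runs past the end of its split. The function names are made up for illustration and are not Hadoop APIs; the exact boundary convention may differ between Hadoop versions.

```python
def first_record_offset(data: bytes, split_start: int) -> int:
    """Offset of the first full line a reader for this split would emit.

    Assumption: a split that does not start at 0 backs up one byte and
    skips through the next newline, so a line that begins exactly at
    split_start is kept and a partial line is dropped.
    """
    if split_start == 0:
        return 0
    newline = data.find(b"\n", split_start - 1)
    return len(data) if newline == -1 else newline + 1


def lines_with_offsets(data: bytes, split_start: int, split_end: int):
    """Yield (byte_offset, line) pairs for one simulated split."""
    pos = first_record_offset(data, split_start)
    # A line is emitted if it STARTS before split_end, even when it ends later.
    while pos < split_end and pos < len(data):
        newline = data.find(b"\n", pos)
        end = len(data) if newline == -1 else newline + 1
        yield pos, data[pos:end]
        pos = end


if __name__ == "__main__":
    data = b"aaaa\nbbbbbb\ncc\n"
    # Two hypothetical splits whose boundary (byte 8) lands mid-line.
    for start, end in [(0, 8), (8, len(data))]:
        print(start, list(lines_with_offsets(data, start, end)))
```

In this model the split start is just a byte boundary, so it only coincides with a line start by accident, and the offset of the first line actually delivered to a mapper cannot be read off the split start alone.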

Vyassa Baratham

0 Answers