I set the property mapred.textoutputformat.separator to the value \001. But when I run the MR job, it throws this exception:
Character reference "" is an invalid XML character.
Please help me.
I found the solution. The cause was that when the separator is "\001" or another control character, it gets transformed into an invalid form when the job configuration is serialized, which is why the XML parser rejects it.
The fix was to Base64-encode the character before setting the property, override the getRecordWriter method of the TextOutputFormat class, and decode the value there (Base64.decodeBase64) before using it as the separator.
This works.
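The encode/decode round trip can be sketched roughly as below. This is only a minimal illustration, not the exact code from my job: the class and method names (SeparatorCodec, encode, decode) are made up for the example, and it uses the JDK's java.util.Base64 instead of the Commons Codec Base64.decodeBase64 call mentioned above so that it runs without extra jars. The Hadoop wiring appears only as comments and is an assumption about how you would plug it in.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: keeps the control character out of the serialized
// job configuration by storing it as plain Base64 text.
final class SeparatorCodec {

    // Driver side: encode the raw separator before putting it in the config,
    // e.g. conf.set("mapred.textoutputformat.separator", encode("\u0001"));
    static String encode(String separator) {
        return Base64.getEncoder()
                .encodeToString(separator.getBytes(StandardCharsets.UTF_8));
    }

    // Task side: decode inside the overridden getRecordWriter, e.g.
    // String sep = decode(conf.get("mapred.textoutputformat.separator"));
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String encoded = encode("\u0001");
        System.out.println(encoded);                    // prints AQ==
        System.out.println((int) decode(encoded).charAt(0)); // prints 1
    }
}
```

The encoded value contains only ordinary ASCII characters, so it can be written to and read back from the job configuration without problems. The decoded string is what you then pass to the record writer as the separator.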