I am new to the Hadoop environment and I want to write a custom Hive SerDe for EBCDIC files. I have searched a lot on the internet but have not found any material about SerDe development. If you know of any resources on SerDe development, please post the links. Thanks in advance.
Viewed 732 times
-1
-
Hi, welcome to Stack Overflow. This question may be considered off-topic here; try to reformulate it with details on what you have tried or found so far. (Questions asking us to recommend or find a book, tool, software library, tutorial or other off-site resource are off-topic for Stack Overflow as they tend to attract opinionated answers and spam. Instead, describe the problem and what has been done so far to solve it. See http://stackoverflow.com/help/on-topic.) – terence hill Dec 29 '15 at 10:48
-
Saying it is an EBCDIC file is not very useful. EBCDIC is a character set (actually a family of character sets) and can be read just like any ASCII / UTF-8 file in Java etc. What will be more difficult is if it is a mainframe file: you could have FB / VB files. Does the file have binary / COBOL zoned-decimal fields? Is there a COBOL copybook? – Bruce Martin Dec 29 '15 at 20:12
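To illustrate the two cases from the comment above, here is a minimal Java sketch. It assumes code page 037 (`IBM037`), which is only one member of the EBCDIC family; the real code page and field layout would come from the copybook. The class and method names (`EbcdicSketch`, `decodeText`, `decodeZoned`) are made up for this example, and the sign handling assumes the common convention where a `0xD` zone nibble in the last byte means negative. `IBM037` ships with a full JDK but may be missing from trimmed runtimes that lack the extended charsets.

```java
import java.nio.charset.Charset;

class EbcdicSketch {
    // Assumption: code page 037. Other sites use IBM500, IBM1047, etc.
    static final Charset EBCDIC = Charset.forName("IBM037");

    // Plain text fields decode like any other charset in Java.
    static String decodeText(byte[] field) {
        return new String(field, EBCDIC);
    }

    // Zoned decimal: one digit per byte in the low nibble;
    // the high nibble of the LAST byte carries the sign.
    // Assumption: 0xD means negative, anything else is treated as non-negative.
    static long decodeZoned(byte[] field) {
        if (field.length == 0) {
            throw new IllegalArgumentException("empty zoned-decimal field");
        }
        long value = 0;
        for (byte b : field) {
            int digit = b & 0x0F;
            if (digit > 9) {
                throw new IllegalArgumentException("not a decimal digit: " + digit);
            }
            value = value * 10 + digit;
        }
        int signNibble = (field[field.length - 1] >> 4) & 0x0F;
        return signNibble == 0x0D ? -value : value;
    }

    public static void main(String[] args) {
        byte[] text = {(byte) 0xC8, (byte) 0xC5, (byte) 0xD3, (byte) 0xD3, (byte) 0xD6};
        byte[] amount = {(byte) 0xF1, (byte) 0xF2, (byte) 0xD3};
        System.out.println(decodeText(text));    // HELLO
        System.out.println(decodeZoned(amount)); // -123
    }
}
```

Text fields are the easy part. Numeric fields such as zoned decimal must be decoded from the raw bytes rather than passed through a charset, which is why the copybook matters.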
1 Answer
0
Start with the official Apache Hive wiki page for SerDe, look at the source code of Hive's built-in implementations, and based on that try to write your own. Also, I doubt that you cannot find any additional blog post or tutorial touching on this topic.

marbu
-
Hello marbu, thanks for the link. I googled a lot and found the documentation on the existing SerDes. There was very little, and only incomplete, material on custom SerDe development. It would be very helpful if you could share a tutorial or blog that explains the SerDe development process from the beginning. Thanks again. – Rohit Khirid Dec 29 '15 at 14:35
-
When the goal is to reimplement part of a well-established system to add a new, specialized feature, chances are that there is no single document that describes everything from scratch. You can't expect to find a single tutorial on implementing a new SerDe aimed at someone who doesn't know Hive internals at all. So the best approach is to start learning about SerDe from the official docs and other implementations (it's open source, yay!); only after that can you expect to understand documents about custom SerDe development. – marbu Dec 29 '15 at 15:36