I am new to Hive, so I apologize if my question rests on any misconceptions.

I need to read a Hadoop SequenceFile into a Hive table. The sequence file contains Thrift binary data, which can be deserialized using the SerDe2 that comes with Hive.

The problem is that one column in the file is encoded with Google protobuf, so when the Thrift SerDe processes the sequence file, it does not decode the protobuf-encoded column properly.

Is there a way in Hive to handle protobuf-encoded columns nested inside a Thrift sequence file, so that every column is parsed properly?

Thank you so much for any possible help!

dtolnay
emiaozang

1 Answer

I believe you need a different SerDe to deserialize the protobuf format. This guide may help:

https://github.com/twitter/elephant-bird/wiki/How-to-use-Elephant-Bird-with-Hive
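To illustrate why a second decoding pass is needed, here is a minimal sketch in plain Python (not Hive or Elephant Bird code). It assumes the Thrift SerDe hands back the nested column as raw bytes, and it decodes those bytes by hand using the protobuf wire format. The row layout, the `payload` column name, and the field numbers are hypothetical; in practice you would use the classes generated from your own `.proto` file, inside a custom SerDe or UDF.

```python
def read_varint(buf, pos):
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear means this was the last byte
            return result, pos
        shift += 7


def decode_fields(buf):
    """Map field number -> value for wire types 0 (varint) and 2 (length-delimited)."""
    fields = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value = bytes(buf[pos:pos + length])
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields[field_number] = value
    return fields


# Hypothetical row as the outer (Thrift) deserializer might return it:
# "payload" is still an opaque protobuf blob at this point.
row = {"id": 7, "payload": bytes([0x08, 0x96, 0x01, 0x12, 0x02, 0x68, 0x69])}

# Second pass: decode the nested protobuf column.
decoded = decode_fields(row["payload"])
print(decoded)  # {1: 150, 2: b'hi'}
```

The point is only that the outer deserializer and the inner one are separate steps: the Thrift layer cannot know the schema of the bytes inside that column, so something protobuf-aware has to run on it afterwards.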

Sathiyan S