I have complex objects with collection fields that need to be stored in Hadoop. I don't want to walk the whole object tree and explicitly store each field, so I am thinking about serializing the complex fields and storing each one as a single big piece, then deserializing it when reading the object back. What is the best way to do this? I thought about using some kind of serialization, but I hope Hadoop has a built-in way to handle this situation.

Sample object's class to store:

class ComplexClass {

    // <simple fields>

    List<AnotherComplexClassWithCollectionFields> collection;

}
Vladimir

1 Answer

HBase only deals with byte arrays, so you can serialize your object in any way you see fit.

The standard Hadoop way of serializing objects is to implement the org.apache.hadoop.io.Writable interface. Then you can serialize your object into a byte array using org.apache.hadoop.io.WritableUtils.toByteArray(Writable ... writable).
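As a rough sketch of what that round trip could look like, the example below mirrors the two methods of the Writable contract, write(DataOutput) and readFields(DataInput), using only java.io so that it runs without Hadoop on the classpath. The names Item, id, toBytes and fromBytes are hypothetical and are not from the question; in a real project both classes would declare "implements org.apache.hadoop.io.Writable".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for AnotherComplexClassWithCollectionFields.
// With Hadoop available it would declare "implements Writable".
class Item {
    String name = "";
    List<Integer> values = new ArrayList<>();

    public void write(DataOutput out) throws IOException {
        out.writeUTF(name);
        out.writeInt(values.size()); // length prefix so the reader knows how many follow
        for (int v : values) {
            out.writeInt(v);
        }
    }

    public void readFields(DataInput in) throws IOException {
        name = in.readUTF();
        int n = in.readInt();
        values = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            values.add(in.readInt());
        }
    }
}

class ComplexClass {
    int id; // assumed example of a "simple field"
    List<Item> collection = new ArrayList<>();

    public void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeInt(collection.size());
        for (Item item : collection) {
            item.write(out); // each nested object serializes itself
        }
    }

    public void readFields(DataInput in) throws IOException {
        id = in.readInt();
        int n = in.readInt();
        collection = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            Item item = new Item();
            item.readFields(in);
            collection.add(item);
        }
    }

    // Serialize this object into one byte array (one cell value, for example).
    byte[] toBytes() throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            write(out);
        }
        return buffer.toByteArray();
    }

    // Deserialize: create an empty instance and let it read its own fields.
    static ComplexClass fromBytes(byte[] bytes) throws IOException {
        ComplexClass result = new ComplexClass();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            result.readFields(in);
        }
        return result;
    }
}
```

With Hadoop on the classpath, WritableUtils.toByteArray could take the place of the hand-written toBytes helper. Reading back would stay the same: wrap the bytes in a DataInputStream and call readFields on a fresh instance.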

Also, there are other serialization frameworks that people in the Hadoop community use, like Avro, Protocol Buffers, and Thrift. All have their specific use cases, so do your research. If you're doing something simple, implementing Hadoop's Writable should be good enough.

bajafresh4life
  • Thanks. How would you convert a byte array back into the original (Writable) object? That is, what would the deserialization look like? Preferably using Hadoop's method of serde. – Girish Rao Sep 13 '13 at 13:40
  • @bajafresh4life: Can you please help me with this one? I'm really new to HBase, so please guide me with easy steps. Thank you. http://stackoverflow.com/questions/24236547/how-to-store-primitive-datatypes-strings-in-a-hbase-column-and-retrieve-them-u – Chamika Kasun Jun 16 '14 at 04:11