I am new to Hive and need to parse logs of the following format:

[Time Stamp] {Complex JSON data}

From my searches so far, I see that there are JSON SerDes available.

Can I extend one of those JSON SerDes to suit my needs? If so, which JSON SerDe would be the better one to start from?

If this approach is not a good one, are there any other pointers?

Thanks

dtolnay
veera

1 Answer

Instead of using an existing open-source SerDe, I found that writing a SerDe myself was much simpler. Apart from the boilerplate code, I just had to write my business logic in the deserialize method, and that worked like a charm.

This blog post was very helpful: http://blog.cloudera.com/blog/2012/12/how-to-use-a-serde-in-apache-hive/

I also tried a UDTF, which worked smoothly too, but I found that the SerDe was much faster.
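To give a rough idea of the business logic that goes into deserialize for a line like `[Time Stamp] {Complex JSON data}`, here is a minimal sketch in plain Java, with no Hive dependency. It only splits the line into the timestamp and the raw JSON text. The class name `LogLineSplitter`, the `split` helper, and the sample log line are made up for illustration and are not my actual code. In a real SerDe you would call something like this on the text of the row passed to deserialize, then hand the JSON part to whatever JSON parser you prefer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: not part of Hive, just the parsing step in isolation.
public class LogLineSplitter {
    // A leading "[...]" timestamp, optional whitespace, then a "{...}" JSON payload.
    // The timestamp group stops at the first ']' so brackets inside the JSON are safe.
    private static final Pattern LINE =
            Pattern.compile("^\\[([^\\]]*)\\]\\s*(\\{.*\\})\\s*$", Pattern.DOTALL);

    /** Returns {timestamp, json}, or null when the line does not have the expected shape. */
    public static String[] split(String line) {
        if (line == null) {
            return null;
        }
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        // Made-up sample line, assumed to follow the format in the question.
        String[] parts = split("[2013-01-15 10:22:01] {\"user\": {\"id\": 7}, \"tags\": [\"a\", \"b\"]}");
        System.out.println(parts[0]); // the timestamp without the brackets
        System.out.println(parts[1]); // the raw JSON text
    }
}
```

Returning null for malformed lines is just one option; in a SerDe you might prefer to throw an exception or emit a row of nulls, depending on how you want bad log lines handled.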

Hope this helps someone

veera