0

How can I see different versions of HBase data in Hive? As I understand it, with HBaseStorageHandler only the latest version of the HBase data is available in Hive. Is my understanding correct and up to date?

Is there any way to access different versions of HBase data using Hive?

Thanks in advance :)

(New to HBase-Hive integration)

dijin

2 Answers

1

That depends on the version of Hive you are using. Prior to Hive 1.1, HBase timestamps were not accessible through the Hive-HBase integration [1] (related: [2]). So the answer is that you need Hive 1.1 or higher. Hope it helps.

[1] https://issues.apache.org/jira/browse/HIVE-2828

[2] https://issues.apache.org/jira/browse/HIVE-8267

Shyam
  • Thanks Shyam. We are using Hive 0.14. I went through the patch. How can I use it in my system? Could you please share the steps to apply the patch? Thanks in advance :) – dijin Feb 07 '16 at 20:58
  • That would mean back-porting the patch to your `hive` version, 0.14. It requires some knowledge of, and tinkering with, the `hive` codebase, and you would need to build `hive` yourself and use the patched jar(s)/distribution. (It's not very hard if you are familiar with Hadoop/Java, but it will take some effort.) – Shyam Feb 08 '16 at 00:28
  • Thanks Shyam for the update. I am currently using Hortonworks in my cluster. Is there any patch available from them? – dijin Feb 09 '16 at 10:01
  • Not that I'm aware of. Unless it's a pressing issue or an important feature, patches are usually not back-ported. The best I can suggest is to ask in the [Hortonworks community](http://hortonworks.com/community/) itself. Sorry if that is not much help. – Shyam Feb 11 '16 at 01:55
0

Not a 100% answer, but some directions. In practice, HBase is always about special cases.

Here is a slightly outdated but really simple article for understanding the approach: http://hortonworks.com/blog/hbase-via-hive-part-1/

So in practice you can implement whatever InputFormat or OutputFormat you need. This does belong to the MapReduce machinery, though.

In principle, Spark can also rely on an InputFormat, so the question is only about your special case.

Another good idea is described here: http://www.slideshare.net/HBaseCon/ecosystem-session-3a. Snapshots can capture the state of the tables you really need, and then you are free to use any tool to connect Hive with HBase, as long as it follows the standards.

In general, the basic idea is to tune the components that connect Hive to your HBase data so that they apply the version filters you need. This does not depend much on the software versions, as this interface is pretty stable.
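To make the idea of a "version filter" concrete, here is a toy Python model of a versioned cell store. It is only a sketch: `VersionedTable`, `put`, and `get` are hypothetical names invented for illustration, not the HBase or Hive API. It assumes that each cell keeps a bounded number of timestamped versions and that a reader can ask for the latest version, several versions, or a half-open timestamp range.

```python
from collections import defaultdict


class VersionedTable:
    """Toy in-memory stand-in for a versioned table (hypothetical, not the HBase API)."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions
        # (row key, column) -> {timestamp: value}
        self.cells = defaultdict(dict)

    def put(self, row, column, timestamp, value):
        versions = self.cells[(row, column)]
        versions[timestamp] = value
        # Drop everything older than the newest `max_versions` entries.
        for stale in sorted(versions, reverse=True)[self.max_versions:]:
            del versions[stale]

    def get(self, row, column, max_versions=1, time_range=None):
        """Return (timestamp, value) pairs, newest first.

        time_range is an optional half-open (start, end) pair: start <= ts < end.
        """
        versions = self.cells.get((row, column), {})
        selected = sorted(versions.items(), reverse=True)
        if time_range is not None:
            start, end = time_range
            selected = [(ts, v) for ts, v in selected if start <= ts < end]
        return selected[:max_versions]


table = VersionedTable(max_versions=3)
for ts, value in [(100, "a"), (200, "b"), (300, "c"), (400, "d")]:
    table.put("row1", "cf:val", ts, value)

# What a latest-only reader sees.
latest = table.get("row1", "cf:val")
# All retained versions, newest first.
history = table.get("row1", "cf:val", max_versions=3)
# Newest version written before timestamp 250.
as_of = table.get("row1", "cf:val", time_range=(0, 250))
```

A reader that only ever calls `get` with the defaults behaves like the latest-only view described in the question. A customized connector would, in this analogy, pass a different `max_versions` or `time_range`.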

Hope this will help you.

Roman Nikitchenko