I'm working on a Hortonworks Data Platform 2.6 cluster with HBase 1.1.2 and Phoenix 4.7 installed.

I have a huge HBase table with lots of columns, and new columns are sometimes added when new data arrives (data is written through the HBase API's Put mechanism).

Now I would like to use Phoenix for this table. I found this tutorial, which says that I have to create a separate Phoenix view based on the HBase table structure: https://khodeprasad.wordpress.com/2016/07/26/how-to-use-existing-hbase-table-in-apache-phoenix/

Based on the tutorial, I'd have to list all the column families and columns (more than 1000 at the moment, and still increasing). That would be a lot of work, and the view wouldn't be up to date when new columns are added to the HBase table.

Now my questions here are:

  1. Does it make sense to use Phoenix for such huge tables, which can also change over time?
  2. Is there a way to create something like a "dynamic" Phoenix view that fits the HBase columns automatically?
D. Müller
  • You may want to look at Read-Only Views: https://phoenix.apache.org/views.html. Phoenix is pretty good with massive data sets, but you should profile your data model with a view. It's the only way to know for certain. If it's just one table, I think it'll perform better than a SELECT/JOIN. – Paul Bastide Aug 16 '17 at 10:36

1 Answer

Phoenix works fine with huge tables that change over time. If columns are added later, you can alter the read-only view and add the column; the Phoenix view will then show all of the data in that column retroactively. There is no way in Phoenix to create a dynamic view, as you could in SQL (e.g. CREATE VIEW view AS SELECT * FROM TABLE), without writing your own Java program to create the columns and keep them up to date.
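As a rough illustration of what such a sync program would have to do, here is a minimal sketch that only generates the `ALTER VIEW ... ADD` statements for columns present in HBase but missing from the view. It is written in Python for brevity rather than Java, and it makes several assumptions: the function and variable names are hypothetical, every new column is mapped to `VARCHAR`, and the two column lists are supplied by you (for example from an HBase scan and from the view's current definition). Running the generated statements against Phoenix is left out.

```python
from typing import Iterable, List, Tuple

# A column is identified by (column family, qualifier).
Column = Tuple[str, str]


def quote(identifier: str) -> str:
    """Double-quote an identifier so Phoenix keeps its case as written."""
    if '"' in identifier:
        # Assumption: names with embedded quotes are rejected, not escaped.
        raise ValueError(f"unsupported identifier: {identifier!r}")
    return f'"{identifier}"'


def alter_view_statements(
    view: str,
    hbase_columns: Iterable[Column],
    view_columns: Iterable[Column],
    sql_type: str = "VARCHAR",  # assumption: all new columns hold strings
) -> List[str]:
    """Return one ALTER VIEW statement per column missing from the view."""
    missing = sorted(set(hbase_columns) - set(view_columns))
    return [
        f"ALTER VIEW {quote(view)} ADD IF NOT EXISTS "
        f"{quote(family)}.{quote(qualifier)} {sql_type}"
        for family, qualifier in missing
    ]


if __name__ == "__main__":
    # Hypothetical data: what HBase holds versus what the view declares.
    in_hbase = [("d", "name"), ("d", "city"), ("m", "source")]
    in_view = [("d", "name")]
    for statement in alter_view_statements("events", in_hbase, in_view):
        print(statement)
```

The statements could then be executed over a Phoenix connection on a schedule, or whenever the writer introduces a new qualifier. Choosing the right SQL type per column is the part that cannot be automated blindly, since HBase stores only bytes.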

Eric Hanson