I would like to perform a rows query with Happybase for some known row keys and add a value filter so that only rows matching the filter are returned.

In HBase shell you can supply a filter to a get command, like so:

get 'meta', 'someuser', {FILTER => "SingleColumnValueFilter ('cf','gender',=,'regexstring:^male$')"}

In Happybase you can add a filter to a scan command but I don't see the option on a rows query. Here is how it works for scan:

rows = tab.scan(filter="SingleColumnValueFilter('cf','gender',=,'regexstring:^male$')")

Is there a way to perform a filtered rows query (for potentially randomly ordered row keys) using Happybase (or any other Python HBase client library)?

I imagined it would look like this (but there is no filter argument):

rows = tab.rows(rows=['h_key', 'a_key', 'z_key'], filter="SingleColumnValueFilter('cf','gender',=,'regexstring:^male$')")
dsimmie

1 Answer

A Get with a filter is equivalent to a Scan whose start row and stop row are both set to that row key.

rows = tab.scan(filter="SingleColumnValueFilter('cf','gender',=,'regexstring:^male$')",
                row_start="someuser", row_stop="someuser")

In Java, a FilterList that combines a MultiRowRangeFilter with a SingleColumnValueFilter would do exactly what you want, and there is an example of that.

However, Happybase uses the HBase Thrift service, which does not seem to support FilterList, so I think the best you can do is to call the scan above once for each key in your example.
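
A minimal sketch of that per-key loop, assuming `tab` is a Happybase table as in the question. `filtered_rows` is a hypothetical helper name, not part of Happybase; it only uses `Table.scan` with the `row_start`, `row_stop` and `filter` arguments shown above, and it relies on the same start/stop row trick, so whether each scan returns exactly one row may depend on your HBase version.

```python
def filtered_rows(table, row_keys, filter_string):
    """Collect (row_key, data) pairs for the given keys that pass the filter.

    Hypothetical helper: issues one single-row scan per key, so the keys
    may be in any order. Rows rejected by the filter are simply absent.
    """
    results = []
    for key in row_keys:
        # Same start/stop row as in the scan above (assumed to select one row).
        for row_key, data in table.scan(row_start=key, row_stop=key,
                                        filter=filter_string):
            results.append((row_key, data))
    return results


# Usage against a real table (connection details are placeholders):
# import happybase
# tab = happybase.Connection('localhost').table('meta')
# rows = filtered_rows(
#     tab, ['h_key', 'a_key', 'z_key'],
#     "SingleColumnValueFilter('cf','gender',=,'regexstring:^male$')")
```

This costs one Thrift round trip per key, so it is only reasonable for a modest number of keys.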

sel-fish
  • On reflection I left something out of my question, I will edit it. The query I want to perform is a `rows` query over a randomly ordered set of row keys. I cannot use a row key range as the order is not known. – dsimmie Aug 30 '16 at 08:34
  • @dsimmie I updated the answer; I don't think there is a perfect solution with the current `happybase` version. – sel-fish Aug 31 '16 at 03:20