HBase read freezes for a few records when reading with a partial rowkey

Asked Jan 02 '20 at 00:27

Active Jan 02 '20 at 08:44

Viewed 53 times

I am reading data from HBase through Spark. The code runs fine when I read with a prefix filter on a complete rowkey or with a GET, but it freezes when I use a partial rowkey as the prefix. The rowkey structure is md5OfAkey_Akey_txDate_someKey. I want to read all rows matching the "Akeys" held in a data frame. The table has a single column family with 50 column qualifiers and holds around 200 million records. When I read using md5OfAkey_Akey_txDate the code gets stuck, whereas if I construct the whole key it runs fine. I do not want to pass the whole rowkey, because I want to read all data for a particular account (Akey) and transaction date (txDate). Any help would be appreciated.
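For illustration, a partial rowkey such as md5OfAkey_Akey_txDate can be turned into an explicit start row and exclusive stop row, so that a scan is bounded to the matching rows instead of relying on the filter alone. The sketch below is plain Python with no HBase dependency. It assumes the MD5 is hex-encoded and the parts are joined with `_`; the function names `row_prefix` and `prefix_to_range` are hypothetical and not taken from the question.

```python
import hashlib
from typing import Tuple


def row_prefix(akey: str, tx_date: str) -> bytes:
    """Build the partial rowkey md5OfAkey_Akey_txDate.

    Assumption: the MD5 is stored as a lowercase hex string and the
    parts are joined with '_', as the rowkey layout in the question suggests.
    """
    md5_hex = hashlib.md5(akey.encode("utf-8")).hexdigest()
    return f"{md5_hex}_{akey}_{tx_date}".encode("utf-8")


def prefix_to_range(prefix: bytes) -> Tuple[bytes, bytes]:
    """Return (start_row, exclusive_stop_row) covering every key with this prefix.

    The stop row is the prefix with its last non-0xFF byte incremented.
    An empty stop row is used here to mean "scan to the end of the table".
    """
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return prefix, b""
    stop = trimmed[:-1] + bytes([trimmed[-1] + 1])
    return prefix, stop


start_row, stop_row = prefix_to_range(row_prefix("acct1", "20200101"))
```

In the HBase Java client the two values would typically be passed to `Scan.withStartRow` and `Scan.withStopRow`; `Scan.setRowPrefixFilter` performs a similar calculation internally. Whether this resolves the freeze described here is not confirmed anywhere in this thread.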

edited Jan 02 '20 at 08:44

Nikhil Suthar


asked Jan 02 '20 at 00:27

Shaggy1755

Performing a scan by partial rowkey (i.e. using PrefixFilter) is expected to be slower than a direct `get`. Can you quantify "stuck", or does it never return? – mazaneicha Jan 03 '20 at 19:49
Sorry for the late reply. I went ahead with the MultiRowRangeFilter in HBase, and the code runs much faster than with the prefix or the fuzzy filter. I am still not sure why the prefix filter took more than 10 minutes to fetch data for a single partial rowkey, whereas the MultiRowRangeFilter returns the same data in seconds. – Shaggy1755 May 08 '20 at 16:40
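
As a rough sketch of the multi-range approach mentioned in the last comment: each (Akey, txDate) pair can be mapped to one (start, exclusive stop) row range, and the resulting list is the kind of input a MultiRowRangeFilter is built from. One likely explanation for the speed difference, not confirmed in this thread, is that a PrefixFilter on a scan without a start row still reads from the beginning of the table, while explicit ranges let the scanner seek directly to the matching rows. The code is plain Python; the helper names are hypothetical, and the hex MD5 encoding and `_` separator are assumptions.

```python
import hashlib
from typing import Iterable, List, Tuple


def _prefix(akey: str, tx_date: str) -> bytes:
    # Assumption: hex-encoded MD5 and '_' separators, following the
    # md5OfAkey_Akey_txDate layout described in the question.
    md5_hex = hashlib.md5(akey.encode("utf-8")).hexdigest()
    return f"{md5_hex}_{akey}_{tx_date}".encode("utf-8")


def _range_for(prefix: bytes) -> Tuple[bytes, bytes]:
    # Exclusive stop row: the prefix with its last non-0xFF byte incremented.
    trimmed = prefix.rstrip(b"\xff")
    if not trimmed:
        return prefix, b""  # empty stop row: scan to the end of the table
    return prefix, trimmed[:-1] + bytes([trimmed[-1] + 1])


def build_row_ranges(keys: Iterable[Tuple[str, str]]) -> List[Tuple[bytes, bytes]]:
    """Map (akey, tx_date) pairs to de-duplicated row ranges sorted by start row."""
    return sorted({_range_for(_prefix(akey, tx_date)) for akey, tx_date in keys})


ranges = build_row_ranges([("acct2", "20200101"), ("acct1", "20200101")])
```

On the Java side, each pair would correspond to one `MultiRowRangeFilter.RowRange` with an inclusive start row and an exclusive stop row.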


0 Answers