26

Can anybody tell me how to list all row keys in an HBase table?

hbase_user
  • 1
    Do you want to list all row keys through the hbase shell or through the Java API? – knt Mar 07 '11 at 17:46
  • Hi knt, I need to list them using a REST-PHP combination. Can you help me? Thanks in advance. – hbase_user Mar 15 '11 at 04:48
  • Look here http://hbase.apache.org/docs/r0.20.4/api/org/apache/hadoop/hbase/stargate/package-summary.html#operation_scanner_create – dminer Feb 16 '12 at 21:11
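
For the REST route raised in the comments above, here is a minimal Java sketch of a scanner round trip. It assumes Java 11+ and the usual HBase REST scanner exchange: a PUT to `<table>/scanner` answers with a `Location` header, repeated GETs on that URL return `CellSet` XML until the scanner is exhausted, and row keys arrive Base64-encoded in each `Row` element's `key` attribute. The class and method names, the `baseUrl` and `table` parameters, and the batch size of 100 are hypothetical. Keys are assumed to be UTF-8 text.

```java
import java.io.StringReader;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper; not part of HBase.
final class RestRowKeys {

    // Pulls the Base64-encoded "key" attribute out of every <Row> element
    // of a CellSet XML document and decodes it as UTF-8 text.
    static List<String> decodeRowKeys(String cellSetXml) {
        try {
            NodeList rows = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(cellSetXml)))
                    .getElementsByTagName("Row");
            List<String> keys = new ArrayList<>();
            for (int i = 0; i < rows.getLength(); i++) {
                String encoded = ((Element) rows.item(i)).getAttribute("key");
                keys.add(new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8));
            }
            return keys;
        } catch (Exception e) {
            throw new IllegalArgumentException("not a parsable CellSet document", e);
        }
    }

    // Creates a scanner, drains it batch by batch, and always deletes it afterwards.
    // baseUrl is assumed to look like "http://host:port" (hypothetical parameter).
    static List<String> listRowKeys(String baseUrl, String table) throws Exception {
        HttpClient http = HttpClient.newHttpClient();
        HttpResponse<String> created = http.send(
                HttpRequest.newBuilder(URI.create(baseUrl + "/" + table + "/scanner"))
                        .header("Content-Type", "text/xml")
                        .PUT(HttpRequest.BodyPublishers.ofString("<Scanner batch=\"100\"/>"))
                        .build(),
                HttpResponse.BodyHandlers.ofString());
        String scannerUrl = created.headers().firstValue("Location")
                .orElseThrow(() -> new IllegalStateException("no scanner Location header"));

        // A set, because a wide row may be split across two batches.
        Set<String> keys = new LinkedHashSet<>();
        try {
            while (true) {
                HttpResponse<String> batch = http.send(
                        HttpRequest.newBuilder(URI.create(scannerUrl))
                                .header("Accept", "text/xml").GET().build(),
                        HttpResponse.BodyHandlers.ofString());
                if (batch.statusCode() != 200) {
                    break; // anything but 200 (e.g. 204 No Content) ends the loop
                }
                keys.addAll(decodeRowKeys(batch.body()));
            }
        } finally {
            http.send(HttpRequest.newBuilder(URI.create(scannerUrl)).DELETE().build(),
                    HttpResponse.BodyHandlers.discarding());
        }
        return new ArrayList<>(keys);
    }
}
```

The same three requests (PUT, repeated GET, DELETE) can be issued from PHP with any HTTP client; only the Base64 decoding of the `key` attribute is specific to the response format.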

6 Answers

31

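The HBase shell can be used to list all the row keys. The count command reports the current row key at every interval, so an interval of 1 prints each one: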

count 'table_name', { INTERVAL => 1 }
030
Haimei
15

This should be considerably faster (the FirstKeyOnlyFilter is run on the server and strips all the column data before sending the result to the client):

Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, tableName.getBytes());
System.out.println("scanning full table:");
Scan scan = new Scan();
scan.setFilter(new FirstKeyOnlyFilter());
ResultScanner scanner = table.getScanner(scan);
for (Result rr : scanner) {
  byte[] key = rr.getRow();
  ...
}
Eric Walker
Geli
  • 1
    Correct, but `scan.setCaching(1000)` is far **more crucial** than just using `FirstKeyOnlyFilter`. Please note that default caching is set to `1`. – G. Demecki Feb 20 '15 at 14:03
5
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, tableName.getBytes());

System.out.println("scanning full table:");
ResultScanner scanner = table.getScanner(new Scan());
for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
  byte[] key = rr.getRow();
  ...
}
David
2

When performing a table scan where only the row keys are needed (no families, qualifiers, values or timestamps), add a FilterList with a MUST_PASS_ALL operator to the scanner using setFilter. The filter list should include both a FirstKeyOnlyFilter and a KeyOnlyFilter. Using this filter combination will result in a worst case scenario of a RegionServer reading a single value from disk and minimal network traffic to the client for a single row.
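
A minimal sketch of that filter combination, assuming an HBase 1.0+ client where an open `Table` handle is available (the class name `RowKeyLister` and the method `listRowKeys` are hypothetical). It also applies the caching advice from an earlier comment. The value 1000 is taken from that comment and is not a tuned recommendation.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
import org.apache.hadoop.hbase.filter.KeyOnlyFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical helper; requires the hbase-client library on the classpath.
final class RowKeyLister {

    // Scans the whole table but asks the region servers to return only the
    // first cell of each row (FirstKeyOnlyFilter) with its value stripped
    // (KeyOnlyFilter), so little more than the row key crosses the network.
    static List<String> listRowKeys(Table table) throws IOException {
        Scan scan = new Scan();
        scan.setCaching(1000); // rows fetched per RPC; assumed value, tune for your cluster
        scan.setFilter(new FilterList(
                FilterList.Operator.MUST_PASS_ALL,
                new FirstKeyOnlyFilter(),
                new KeyOnlyFilter()));

        List<String> keys = new ArrayList<>();
        // try-with-resources closes the scanner and frees server-side resources
        try (ResultScanner scanner = table.getScanner(scan)) {
            for (Result r : scanner) {
                keys.add(Bytes.toString(r.getRow())); // assumes keys are UTF-8 text
            }
        }
        return keys;
    }
}
```

For very large tables, collecting every key into a list may not fit in memory; processing each key inside the loop is the safer variant.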

Patruni Srikanth
2

Use the getRow method of the Result class. Its description says:

Method for retrieving the row key that corresponds to the row from which this Result was created.

Assuming table is your hbase table and you are connected to your HBase instance, all you need to do is:

Scan scan = new Scan();
ResultScanner rscanner = table.getScanner(scan);
for(Result r : rscanner){
   //r is the result object that contains the row
   //do something
   System.out.println(Bytes.toString(r.getRow())); //doing something
}

I understand that this has already been answered from the Java API point of view, but a little more detail never hurt anyone.

Nikhil Vandanapu
1

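It seems that you want to use the HBase Thrift client in PHP. Here is some sample code that scans a table and prints each result, including its row key. Note that the filter string shown is only an example and restricts the scan (for instance, to rows prefixed with 'row2'), so adjust it to cover the rows you need.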

<? $_SERVER['PHP_ROOT'] = realpath(dirname(__FILE__).'/..');
   require_once $_SERVER['PHP_ROOT'].'/flib/__flib.php';
   flib_init(FLIB_CONTEXT_SCRIPT);
   require_module('storage/hbase');
   $hbase = new HBase('<server_name_running_thrift_server>', <port on which thrift server is running>);
   $hbase->open();
   $client = $hbase->getClient();
   $result = $client->scannerOpenWithFilterString('table_name', "(PrefixFilter ('row2') AND (QualifierFilter (>=, 'binary:xyz'))) AND (TimestampsFilter ( 123, 456))");
   $to_print = $client->scannerGetList($result,1);
   while ($to_print) {
      print_r($to_print);
      $to_print = $client->scannerGetList($result,1);
    }
   $client->scannerClose($result);
?>
tobe