
I have an HBase table with rowKeys like these (delimiter = '#'):

0CE5C485#1481400000#A#B#C#T
00C6F485#1481600000#F#J#C#G
065ED485#1481500000#T#X#C#G
...
...

The first part is the hex representation of the timestamp, reversed; the second part is the timestamp itself. I chose this rowkey format so that keys are spread evenly across regions. My regions are split on the first two characters of the rowKey ('00', '01', ..., 'FE', 'FF'), 256 in total.
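For reference, a minimal sketch of how a key in this layout can be built. It is plain Java with no HBase calls; the class and method names are only illustrative, and it assumes upper-case hex without padding, which matches the sample keys above.

```java
class SaltedKeyDemo {
    // Hex of the timestamp, upper-cased and reversed: 1481400000 -> "584C5EC0" -> "0CE5C485".
    static String reversedHex(long timestamp) {
        String hex = Long.toHexString(timestamp).toUpperCase();
        return new StringBuilder(hex).reverse().toString();
    }

    // Joins the reversed-hex prefix, the timestamp and the remaining fields with '#'.
    static String rowKey(long timestamp, String... rest) {
        return reversedHex(timestamp) + "#" + timestamp + "#" + String.join("#", rest);
    }

    public static void main(String[] args) {
        // prints 0CE5C485#1481400000#A#B#C#T
        System.out.println(rowKey(1481400000L, "A", "B", "C", "T"));
    }
}
```

Because the prefix is the reversed hex, consecutive timestamps differ in their leading characters, which is what spreads them over the 256 regions.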

Is there a way to get all rows between two timestamps without overriding the timestamp in the value?

I tried regex comparators on top of RowFilters, e.g.

FilterList f = new FilterList(FilterList.Operator.MUST_PASS_ALL);
Filter f1 = new RowFilter(CompareFilter.CompareOp.GREATER_OR_EQUAL, new RegexStringComparator(".*1481400000"));
Filter f2 = new RowFilter(CompareFilter.CompareOp.LESS_OR_EQUAL, new RegexStringComparator(".*1481600000"));

f.addFilter(f1);
f.addFilter(f2);

This gave me wrong results. I also tried a SubstringComparator in the same way, but that did not give me the correct results either.

The above is only an example I wrote for the question but I hope you understand the problem I have at hand.

I want to use the same key structure and achieve what I want. Is that even possible?

Huga
  • have you tried `public Scan setTimeRange(long minStamp, long maxStamp) throws IOException` ? AFAIK, the above mentioned way is not suitable to Range scans. – Ram Ghadiyaram Dec 11 '16 at 08:23

2 Answers


I'd suggest a time range filter.

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.*;
import org.apache.hadoop.hbase.util.Bytes;
import java.io.IOException;

public class test {
    public static void main (String[] args) throws IOException {
        HTable table = new HTable(HBaseConfiguration.create(), "t1");
        Scan s = new Scan();
        s.setMaxVersions(1);
        // restrict the scan to cells whose timestamp is in [minStamp, maxStamp)
        s.setTimeRange(1481400000L, 1481600000L);
        ResultScanner scanner = table.getScanner(s);
        for (Result rr = scanner.next(); rr != null; rr = scanner.next()) {
            System.out.println(Bytes.toString(rr.getRow()) + " => " +
                    Bytes.toString(rr.getValue(Bytes.toBytes("f1"), Bytes.toBytes("a"))));
        }
        // release the scanner and the table connection when done
        scanner.close();
        table.close();
    }
}
Ram Ghadiyaram

Scan.setTimeRange() filters VERSIONS of columns/cells by their cell timestamp. It has nothing to do with row key filtering. See https://javadoc.io/doc/org.apache.hbase/hbase-client/1.0.0/org/apache/hadoop/hbase/client/Scan.html#setTimeRange(long,%20long)

Row keys are sorted lexicographically, so I believe the hex code should be the second field of the row key, with the timestamp first. Then you could just use the partial-key scan API, which is far faster than filters. E.g.

Scan scan = new Scan();
scan.setStartRow(Bytes.toBytes("1481400000")); // inclusive
scan.setStopRow(Bytes.toBytes("1481500000"));  // exclusive
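To illustrate the ordering point with the sample keys from the question, here is a small sketch in plain Java (no HBase calls; the class and method names are made up for illustration). It sorts the keys as strings, which for ASCII-only keys gives the same order as HBase's byte-wise comparison, and prints the timestamp embedded in each key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RowKeyOrderDemo {
    // Second '#'-delimited field of the key, i.e. the embedded timestamp.
    static long timestampOf(String rowKey) {
        return Long.parseLong(rowKey.split("#")[1]);
    }

    // Sorted copy of the keys; the input list is left untouched.
    static List<String> sortedKeys(List<String> keys) {
        List<String> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList(
                "0CE5C485#1481400000#A#B#C#T",
                "00C6F485#1481600000#F#J#C#G",
                "065ED485#1481500000#T#X#C#G");
        for (String key : sortedKeys(keys)) {
            System.out.println(key + " -> " + timestampOf(key));
        }
    }
}
```

With these three keys the sorted order is 1481600000, 1481500000, 1481400000, so the rows are ordered by the reversed-hex prefix rather than by time. That is why a start/stop row pair only brackets a time range once the timestamp is the leading field.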
vvvvv