Pagination for Reading Titan Vertex from HBase

Question

I am writing Java code that reads Titan vertices from the Hadoop HBase backend. I know the Blueprints API provides a getVertices() method on every TransactionalGraph, but I am still trying to implement my own method. For ordinary vertex reading I already have working code that reads the entire HBase backend and fetches all the vertices of the Titan graph, but I am having a problem implementing pagination.

My code so far:

    Scan scan = new Scan();
    Filter pageFilter = new ColumnPaginationFilter(DEFAULT_PAGE_SIZE, currentOffSet);
    scan.setFilter(pageFilter);
    scan.addFamily(Backend.EDGESTORE_NAME.getBytes());
    scan.setMaxVersions(10);
    List<Vertex> vertexList = new ArrayList<>(DEFAULT_PAGE_SIZE);
    HTablePool pool = new HTablePool(config, DEFAULT_PAGE_SIZE);
    ResultScanner scanner = pool.getTable(attributeMap.get("storage.tablename")).getScanner(scan);

But the ResultScanner returns the entire graph.

currentOffSet is an int variable that determines the current page number.

I also tried ResultScanner#next(int rowCount). It works fine, but with this approach I have no way to go back to the previous page.

Can anyone help me?

Thank you in advance.

score 0 · Accepted Answer · answered Jul 08 '13 at 05:39

I have solved it. The logic is pretty simple. You have to use the setStartRow method on the Scan instance. For the first page this is not necessary, because scanning should start from the very first row. Then we fetch *(PAGE_SIZE + 1)* rows. The last row returned by the ResultScanner is used as the starting row for the next page.
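To make the look-ahead idea concrete without an HBase cluster, here is a minimal sketch that pages over a sorted in-memory map standing in for the table. The class `LookAheadPager`, its `Page` holder, and the `fetch` method are hypothetical names used only for illustration; none of them are part of HBase or Titan.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;

/** Hypothetical illustration: fetch pageSize + 1 sorted keys and keep the extra one as a bookmark. */
final class LookAheadPager {

    /** One page of keys plus the key the next page should start from (null on the last page). */
    static final class Page {
        final List<String> keys;
        final String nextStart;

        Page(List<String> keys, String nextStart) {
            this.keys = keys;
            this.nextStart = nextStart;
        }
    }

    /**
     * Reads one page from a sorted map that stands in for the HBase table.
     * A null startKey means "start from the very first row".
     */
    static Page fetch(NavigableMap<String, String> table, String startKey, int pageSize) {
        // The start key is inclusive, like a scan's start row.
        NavigableMap<String, String> tail =
                (startKey == null) ? table : table.tailMap(startKey, true);
        List<String> keys = new ArrayList<>(pageSize);
        String nextStart = null;
        for (String key : tail.keySet()) {
            if (keys.size() < pageSize) {
                keys.add(key);
            } else {
                // The (pageSize + 1)-th key is not shown; it only marks the next page.
                nextStart = key;
                break;
            }
        }
        return new Page(keys, nextStart);
    }
}
```

Because the start key is inclusive, the extra key read on one page becomes the first key shown on the next, so no row is skipped or repeated.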

To go back to the previous page we need a buffer or stack that stores the starting row of every previously visited page.

Here is my code snippet:

    // Restrict the scan to Titan's edge store column family.
    Scan scan = new Scan().addFamily(Backend.EDGESTORE_NAME.getBytes());
    // Ask for one row more than a page; the extra row marks where the next page starts.
    scan.setFilter(new PageFilter(DEFAULT_PAGE_SIZE + 1));
    if (currentPageStartRowForHBase != null) {
        scan.setStartRow(currentPageStartRowForHBase);
    }
    List<Vertex> vertexList = new ArrayList<>(DEFAULT_PAGE_SIZE);
    nextPageStartRowForHBase = null; // stays null when this is the last page
    HTablePool pool = new HTablePool(config, DEFAULT_PAGE_SIZE + 1);
    ResultScanner scanner = null;
    try {
        scanner = pool.getTable(attributeMap.get("storage.tablename")).getScanner(scan);
        for (Result result : scanner) {
            if (vertexList.size() < DEFAULT_PAGE_SIZE) {
                ByteBuffer byteBuffer = ByteBuffer.wrap(result.getRow());
                vertexList.add(this.getVertex(IDHandler.getKeyID(byteBuffer)));
            } else {
                // First row beyond the page: remember it and stop. PageFilter is applied
                // per region server, so the scanner may return more rows than requested.
                nextPageStartRowForHBase = result.getRow();
                break;
            }
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (scanner != null) {
            scanner.close(); // release the server-side scanner
        }
    }

nextPageStartRowForHBase and currentPageStartRowForHBase are both byte[].
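The snippet does not show the bookkeeping for going back to a previous page. A minimal sketch of the stack idea might look like the following. `PageHistory` and its method names are hypothetical, and the sketch assumes that an empty byte array can stand for "start of table", which matches how HBase represents an unset start row.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: remembers the start row of every page visited so far. */
final class PageHistory {
    // Start rows of the pages before the current one; the most recent is on top.
    private final Deque<byte[]> previousStarts = new ArrayDeque<>();
    // An empty array means "scan from the very first row" (ArrayDeque rejects null).
    private byte[] currentStart = new byte[0];

    /** Start row to pass to the scan for the page being shown. */
    byte[] currentStart() {
        return currentStart;
    }

    /** Call when moving forward; nextStart is the look-ahead row from the current page. */
    void advance(byte[] nextStart) {
        previousStarts.push(currentStart);
        currentStart = nextStart;
    }

    boolean hasPrevious() {
        return !previousStarts.isEmpty();
    }

    /** Call when moving back; restores the start row of the previous page. */
    byte[] back() {
        if (previousStarts.isEmpty()) {
            throw new IllegalStateException("already on the first page");
        }
        currentStart = previousStarts.pop();
        return currentStart;
    }
}
```

With such a helper, "Next" would call `advance(nextPageStartRowForHBase)` and "Previous" would call `back()`, and in both cases the returned start row is used for the following scan.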

This fulfilled my requirement, but if anyone has a better solution, please share it with us.

I just started evaluating Titan and my understanding was that one cannot directly read/write Titan graph data from HBase. It looks like you are saying it is possible. Could you point me to examples that show how to do this? — chapstick, Nov 18 '13 at 22:42
The above code is the simplest example of what you are asking for. The List vertexList is the collection I used to hold the vertices. The call `IDHandler.getKeyID(byteBuffer)` returns the vertex id from the HBase backend. Once I have the vertex id, it is not complicated to get the Vertex instance. What I needed was a simple way to read data from the HBase and Cassandra backends, because Titan does not support global queries. If you want a better way to read data from HBase or Cassandra, you can definitely do it by exploring the backend functionality. — Pradatta, Nov 21 '13 at 10:39
