Questions tagged [hbase]

HBase is the Hadoop database (columnar). Use it when you need random, real-time read/write access to your Big Data. This project's goal is the hosting of very large tables (billions of rows by millions of columns) atop clusters of commodity hardware.

HBase is an open source, non-relational, distributed, versioned, column-oriented database written in Java and modeled after Google's Bigtable, as described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of the Hadoop Distributed File System (HDFS). It is developed as part of the Apache Software Foundation's Apache Hadoop project. HBase includes:

  • Convenient base classes for backing Hadoop MapReduce jobs with HBase tables, including Cascading, Hive, and Pig source and sink modules
  • Query predicate push down via server side scan and get filters
  • Optimizations for real-time queries
  • A Thrift gateway and a RESTful Web service that supports XML, Protobuf, and binary data encoding options
  • Extensible JRuby-based (JIRB) shell
  • Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX
6961 questions
2
votes
1 answer

hbase: assign CompareFilter.CompareOp based on input value

Can we take the compareOperator value based on input value? For example if my input is eq, then it should pick CompareFilter.CompareOp.EQUAL, else if input is ne it should pick CompareFilter.CompareOp.NOT_EQUAL. Something like…
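
One way to approach the mapping described in this excerpt is a small lookup table from the input string to the enum constant. The sketch below is only an illustration under stated assumptions: `CompareOps` and `fromCode` are hypothetical names, the short codes other than `eq` and `ne` are invented for the example, and the local `CompareOp` enum is a stand-in that mirrors the constant names of HBase 1.x's `CompareFilter.CompareOp` so the code compiles without HBase on the classpath.

```java
import java.util.Locale;
import java.util.Map;

// Stand-in for org.apache.hadoop.hbase.filter.CompareFilter.CompareOp (HBase 1.x).
// Assumption: with HBase on the classpath you would delete this enum and
// import the real one instead.
enum CompareOp { LESS, LESS_OR_EQUAL, EQUAL, NOT_EQUAL, GREATER_OR_EQUAL, GREATER, NO_OP }

// Hypothetical helper; not part of HBase.
final class CompareOps {
    // Only "eq" and "ne" come from the question; the other codes are assumptions.
    private static final Map<String, CompareOp> BY_CODE = Map.of(
            "lt", CompareOp.LESS,
            "le", CompareOp.LESS_OR_EQUAL,
            "eq", CompareOp.EQUAL,
            "ne", CompareOp.NOT_EQUAL,
            "ge", CompareOp.GREATER_OR_EQUAL,
            "gt", CompareOp.GREATER);

    private CompareOps() {}

    // Returns the operator for a short code, ignoring case and surrounding whitespace.
    // Throws IllegalArgumentException for null or unrecognized input.
    static CompareOp fromCode(String code) {
        if (code == null) {
            throw new IllegalArgumentException("Comparison code is null");
        }
        CompareOp op = BY_CODE.get(code.trim().toLowerCase(Locale.ROOT));
        if (op == null) {
            throw new IllegalArgumentException("Unknown comparison code: " + code);
        }
        return op;
    }
}
```

With HBase 1.x available, the value returned by `CompareOps.fromCode("eq")` could be passed to a filter constructor that accepts a `CompareOp`, such as `SingleColumnValueFilter`. HBase 2.x deprecates `CompareFilter.CompareOp` in favor of `CompareOperator`, so the same lookup would target that enum there.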
2
votes
1 answer

How to write custom database adapter for django using JDBC drivers?

I have a web app in Django with an HBase backend. To access HBase I'm using Apache Phoenix for queries. Phoenix exposes JDBC drivers. How can I integrate Phoenix with the Django ORM using these JDBC drivers? Can I write a custom db adapter or is…
SaurabhR
  • 129
  • 3
  • 10
2
votes
1 answer

Get Column Names in HBase Thrift C++?

I need to get the list of column qualifiers available in a HBase table. Suppose I have a table 'Customers' with column 'info:Age' and 'contact:PhoneNo'. To get the list of column families I see there's a method 'getColumnDescriptors' which returns…
Edwin Vivek N
  • 564
  • 8
  • 28
2
votes
0 answers

Hbase Master is failing in a cluster set up. (Hadoop, Hbase, and Zookeeper)

I am running a single-node cluster with Hadoop, HBase, and ZooKeeper. HMaster is failing to construct with the following error. Can anyone help me? 2016-06-02 14:51:56,770 INFO [master/localhost/127.0.0.1:60000] zookeeper.ZooKeeper: Initiating client…
2
votes
2 answers

Parallel querying HBase for List of row keys using MapReduce

I want to query HBase to fetch records for a provided list of row keys. Since Mappers in MapReduce work in parallel, I want to use them. The input list of row keys will be in the range of ~100000 and I have created a…
hp36
  • 269
  • 1
  • 6
  • 20
2
votes
1 answer

Recovering from HBase server failure using Async HBase client

I'm currently trying to find a way to deal with unexpected HBase failures in my application. More specifically, what I'm trying to solve is a case where my application inserts data to HBase and then HBase fails and restarts. In order to check how…
Gideon
  • 2,211
  • 5
  • 29
  • 47
2
votes
1 answer

how to connect to Hbase managed zookeeper

I created an HBase test environment in pseudo-distributed mode and did not set up an independent ZooKeeper. How can I connect to the ZooKeeper instance managed by HBase? I could not find zkCli.sh in the HBase installation folder. Thanks a lot.
mailme365
  • 511
  • 2
  • 9
  • 20
2
votes
1 answer

SaveAsHadoopDataset never closes connection To zookeeper

I am using the below code to write to hbase jsonDStream.foreachRDD(new Function, Void>() { @Override public Void call(JavaRDD rdd) throws Exception { DataFrame jsonFrame =…
Amit_Hora
  • 716
  • 1
  • 8
  • 27
2
votes
0 answers

HBase: execute small job using cluster

I have a Java function that runs on a single HBase row (a Result), it takes a Result as an input and outputs a byte[]. I would like to run this function on 10K-100K HBase rows and collect the results. I have a List which is the rows I'd like…
ytoledano
  • 3,003
  • 2
  • 24
  • 39
2
votes
1 answer

Spark: Partitioning an RDD created from HBase data

If I read some data from an HBase (or MapR-DB) table with JavaPairRDD usersRDD = sc.newAPIHadoopRDD(hbaseConf, TableInputFormat.class, ImmutableBytesWritable.class, Result.class); the resulting RDD has 1 partition,…
Skice
  • 461
  • 5
  • 18
2
votes
0 answers

Nutch2.3.1 hangs while inject, parse fetch, generate

I've read various SO threads on why it takes so long (or hangs) while generating/injecting/parsing/fetching, but to no luck. The solutions in the following SO threads I've tried implementing, but no luck. 1) Nutch 2.1 urls injection takes forever 2)…
Praful Bagai
  • 16,684
  • 50
  • 136
  • 267
2
votes
2 answers

Store image in Hbase loss of Meta data and Exif

After uploading an image to HBase with a Java program and retrieving it, I found that the file size had increased and most of the Exif and metadata (GPS location data, camera details, etc.) had been lost. Code: public ArrayList
2
votes
4 answers

getting inconsistent row count from phoenix for HBase

We are facing a weird issue with Phoenix and HBase. We have an MR program that loads data into an HBase table. We use Phoenix for inserting and reading data from HBase. The issue is that after the data is loaded, the count for a particular table matches what we got…
2
votes
0 answers

HBase java custom filter error DeserializationException ClassNotFoundException

I extended FilterBase in order to create my custom filter MyCustomFilter. I compiled it, uploaded the .jar to the server where HBase runs, restarted it, and when I add this class to the filters (I have only this filter) ArrayList testList…
bpdin
  • 101
  • 1
  • 10
2
votes
1 answer

Spark insert to Hbase

I have a pojo class of emp like below: I am able to read streaming data and I want to insert data to Hbase @JsonInclude(Include.NON_NULL) public class empData implements Serializable { private String id; private String name; @Override …
Amaresh
  • 3,231
  • 7
  • 37
  • 60