Questions tagged [hbase]

HBase is the Hadoop database (columnar). Use it when you need random, real-time read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows × millions of columns -- atop clusters of commodity hardware.

HBase is an open source, non-relational, distributed, versioned, column-oriented database written in Java and modeled after Google's Bigtable, as described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of the Hadoop Distributed File System (HDFS). It is developed as part of the Apache Software Foundation's Apache Hadoop project. HBase includes:

  • Convenient base classes for backing Hadoop MapReduce jobs with HBase tables, including Cascading, Hive, and Pig source and sink modules
  • Query predicate push down via server side scan and get filters
  • Optimizations for real-time queries
  • A Thrift gateway and a RESTful Web service that supports XML, Protobuf, and binary data encoding options
  • Extensible JRuby-based (JIRB) shell
  • Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX
6961 questions
18
votes
1 answer

Timestamp Based Scans in HBase?

For example, for the HBase table 'test_table', the values inserted are: Row1 - Val1 => t, Row1 - Val2 => t + 3, Row1 - Val3 => t + 5, Row2 - Val1 => t, Row2 - Val2 => t + 3, Row2 - Val3 => t + 5. A scan of 'test_table' where version = t + 4 should return Row1 -…
Krishna Kalyan
  • 1,672
  • 2
  • 20
  • 43
18
votes
2 answers

hadoop and hbase rebalancing after node additions

I have a fundamental question about the load balancer. I just finished adding new nodes to our Hadoop (2.3) cluster, which also has HBase v0.98. After the addition, and with all nodes online in Hadoop and HBase, how is HBase affected by Hadoop…
user3642189
  • 181
  • 1
  • 1
  • 3
18
votes
10 answers

hbase cannot find an existing table

I set up an HBase cluster to store data from OpenTSDB. Recently, due to a reboot of some of the nodes, HBase lost the table "tsdb". I can still see it on HBase's master node page, but when I click on it, it gives me a…
Sheng
  • 1,697
  • 4
  • 19
  • 33
17
votes
2 answers

Difference between String.getBytes() and Bytes.toBytes(String data)

I'm writing a Hadoop/HBase job. I needed to transform a Java String into a byte array. Are there any differences between Java's String.getBytes() and Hadoop's Bytes.toBytes()?
victorunique
  • 310
  • 2
  • 3
  • 12
17
votes
1 answer

./bootstrap: 17: exec: autoreconf: not found : OpenTSDB installation

I am trying to install OpenTSDB on Ubuntu, and I am following this documentation. But after running these commands: git clone git://github.com/OpenTSDB/opentsdb.git cd opentsdb running this command gives the following console…
Bharthan
  • 1,458
  • 2
  • 17
  • 29
17
votes
7 answers

org.apache.hadoop.hbase.PleaseHoldException: Master is initializing

I am trying to set up a multinode HBase cluster. When I run jps on the slave I get 5780 Jps 5558 HQuorumPeer 5684 HRegionServer 1963 DataNode 2093 TaskTracker; similarly, on the master I get 4254 SecondaryNameNode 15226 Jps 14982 HMaster 3907…
Naresh
  • 5,073
  • 12
  • 67
  • 124
16
votes
9 answers

How to clear a table in hbase?

I want to empty a table in HBase, e.g. user. Is there any command or function to empty the table without deleting it? My table structure is: $mutations = array( new Mutation( array( 'column' =>…
Micku
  • 550
  • 4
  • 9
  • 23
16
votes
3 answers

HBase getting all timestamped values for a cell

I have the following scenario in my HBase instance: hbase(main):002:0> create 'test', 'cf' 0 row(s) in 1.4690 seconds hbase(main):003:0> put 'test', 'row1', 'cf:a', 'value1' 0 row(s) in 0.1480 seconds hbase(main):004:0> put 'test', 'row2', 'cf:b',…
FUD
  • 5,114
  • 7
  • 39
  • 61
16
votes
6 answers

A script that deletes all tables in Hbase

I can tell hbase to disable and delete particular tables using: disable 'tablename' drop 'tablename' But I want to delete all the tables in the database without hardcoding the names of any of the tables. Is there a way to do this? I want to do this…
Vlad the Impala
  • 15,572
  • 16
  • 81
  • 124
16
votes
1 answer

How to connect HBase and Spark using Python?

I have an embarrassingly parallel task for which I use Spark to distribute the computations. These computations are in Python, and I use PySpark to read and preprocess the data. The input data to my task is stored in HBase. Unfortunately, I've yet…
Def_Os
  • 5,301
  • 5
  • 34
  • 63
16
votes
5 answers

Is there a way to add nodes to a running Hadoop cluster?

I have been playing with Cloudera and I define the number of clusters before I start my job then use the cloudera manager to make sure everything is running. I’m working on a new project that instead of using hadoop is using message queues to…
user1735075
  • 3,221
  • 4
  • 16
  • 16
16
votes
1 answer

Why OpenTSDB chose HBase for Time Series data storage?

I would really appreciate it if somebody could shed some light on the choice of HBase as the data storage engine for OpenTSDB. Which other choices, such as Whisper (Graphite front-end + Carbon persistence), were considered? How is a column-oriented db such as…
Rajan
  • 739
  • 1
  • 6
  • 8
16
votes
7 answers

Repair HBase table (unassigned region in transition)

I'm a bit stuck repairing a faulty table (on HBase 0.92.1-cdh4.0.0, Hadoop 2.0.0-cdh4.0.0). There is a region in transition that doesn't finish: Region State bf2025f4bc154914b5942af4e72ea063…
Mario
  • 1,801
  • 3
  • 20
  • 32
15
votes
1 answer

HBase & Mahout - Using HBase as a Datastore/source for Mahout - Classification

I'm working on a large text classification project and we have our text data (simple messages) stored in HBase. We have two problems. First, we would like to use HBase as the source for Mahout classifiers, namely Bayes and Random Forests. Second,…
NightWolf
  • 7,694
  • 9
  • 74
  • 121
15
votes
6 answers

Hbase client ConnectionLoss for /hbase error

I'm going completely crazy: Installed Hadoop/Hbase, all is running; /opt/jdk1.6.0_24/bin/jps 23261 ThriftServer 22582 QuorumPeerMain 21969 NameNode 23500 Jps 23021 HRegionServer 22211 TaskTracker 22891 HMaster 22117 SecondaryNameNode 21779…
CharlesS
  • 1,563
  • 2
  • 18
  • 31