Questions tagged [hbase]

HBase is the Hadoop database (columnar). Use it when you need random, real-time read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows × millions of columns -- atop clusters of commodity hardware.

HBase is an open-source, non-relational, distributed, versioned, column-oriented database written in Java and modeled after Google's Bigtable ("Bigtable: A Distributed Storage System for Structured Data" by Chang et al.). It is developed as part of the Apache Software Foundation's Apache Hadoop project: just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of the Hadoop Distributed File System (HDFS). HBase includes:

  • Convenient base classes for backing Hadoop MapReduce jobs with HBase tables, including Cascading, Hive, and Pig source and sink modules
  • Query predicate push down via server side scan and get filters
  • Optimizations for real time queries
  • A Thrift gateway and a RESTful Web service that supports XML, Protobuf, and binary data encoding options
  • An extensible JRuby-based (JIRB) shell
  • Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX
6961 questions
2 votes, 1 answer

doInTable(HTableInterface table) is deprecated

What can I use instead of doInTable(HTableInterface table), which is deprecated? Below is the code: hbaseTemplate.execute(tableName, new TableCallback() { public User doInTable(HTableInterface table) throws Throwable { Put p =…
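The deprecation points at the old HTableInterface/HTablePool API. A minimal sketch of the plain HBase 1.x client that replaces it, written in Scala; the table, family, and qualifier names below are placeholders, not taken from the question:

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
    import org.apache.hadoop.hbase.util.Bytes

    val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val table = connection.getTable(TableName.valueOf("users")) // placeholder table
    try {
      val p = new Put(Bytes.toBytes("row1"))
      p.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("name"), Bytes.toBytes("john"))
      table.put(p)
    } finally {
      table.close()
      connection.close()
    }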
2 votes, 1 answer

HBase: Auto-increment of column

I'm new to HBase and need help. I have a table with some data in HBase: Id Name Address / 1 john XX-XX / 2 mike XXX-XX, and the Id should auto-increment. Now I have to insert data into the table so that if we insert 10 records the Id should increment to 12…
user7183867
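HBase has no built-in auto-increment, but it does have atomic counters. A sketch in Scala, assuming HBase 1.x and a dedicated counter row (table, family, and qualifier names are hypothetical): each call returns the next ID, which can then be used as the row key of the inserted record.

    import org.apache.hadoop.hbase.TableName
    import org.apache.hadoop.hbase.client.Connection
    import org.apache.hadoop.hbase.util.Bytes

    // Atomically bump a counter cell and return the new value.
    def nextId(connection: Connection): Long = {
      val table = connection.getTable(TableName.valueOf("users")) // hypothetical
      try table.incrementColumnValue(
        Bytes.toBytes("counter"), Bytes.toBytes("cf"), Bytes.toBytes("id"), 1L)
      finally table.close()
    }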
2 votes, 1 answer

Phoenix not returning any rows from HBase view

I have a table in HBase that definitely contains data: scan "my_table", {LIMIT=>1} 000008d624784f434ea441eb930eb84e201511162015111624000024498 column=g:orig_ccy, timestamp=3688201677984955, value=XXX However, after creating a view over the top of…
undershock
2 votes, 1 answer

HBase, Hadoop: How can I estimate the size of an HBase table or Hadoop file system paths?

I have multiple HBase tables; how can I estimate their approximate size in Java?
Anuranjan
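One low-tech estimate, sketched in Scala: sum the bytes under the table's directory in HDFS. This assumes the default HBase 1.x layout /hbase/data/<namespace>/<table> and measures on-disk HFile size (after compression), not logical data size; the path is a placeholder.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.{FileSystem, Path}

    val fs = FileSystem.get(new Configuration())
    // Total bytes of all regions' store files for the table.
    val summary = fs.getContentSummary(new Path("/hbase/data/default/my_table"))
    println(s"approximate on-disk size: ${summary.getLength} bytes")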
2 votes, 1 answer

How to get the latest sample record from an HBase table (in the last n hours)?

There are a huge number of records in a large HBase transactional table. From the HBase shell: how can I get a sample record that was inserted/updated in the last 6 hours? Is it possible to get the count of records inserted/updated in the last 6 hours?
Vijay Innamuri
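In the shell this is the TIMERANGE scan attribute; from the client API the same thing can be expressed with Scan.setTimeRange. A sketch, under the assumption that cell timestamps were not overridden at write time (the table name is hypothetical):

    import org.apache.hadoop.hbase.TableName
    import org.apache.hadoop.hbase.client.{Connection, Scan}
    import scala.collection.JavaConverters._

    def sampleLastSixHours(connection: Connection): Unit = {
      val now = System.currentTimeMillis()
      val scan = new Scan()
      scan.setTimeRange(now - 6L * 60 * 60 * 1000, now) // filter on cell timestamps
      val table = connection.getTable(TableName.valueOf("tx_table")) // hypothetical
      try {
        val scanner = table.getScanner(scan)
        // Take one matching row as the sample; counting them all means a full scan.
        scanner.iterator().asScala.take(1).foreach(println)
        scanner.close()
      } finally table.close()
    }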
2 votes, 0 answers

Thrift 0.9.0 not generating Python code on Ubuntu 14

Using Ambari 2.2.2.0 (HDP 2.4.2.0-258), which contains HBase 1.1.2.2.4.2.0-258. I'm updating a Python client developed for an earlier version of HBase. Using Thrift 0.9.0, neither hbase1.thrift nor hbase2.thrift generates the gen-py directories (or…
2 votes, 0 answers

Range in HBase FuzzyRowFilter?

I have some data in HBase. The structure of the key is userID (Integer) + dateTimeInMillis (Long). I used the following code in the past to get the rows in a range: Scan scan = new Scan(startKey.array(),…
Hammad
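FuzzyRowFilter matches fixed-vs-wildcard byte positions, so it cannot express a range by itself; the usual pattern is to fix the userID prefix with the filter (or with start/stop rows) and bound the time part separately. A sketch, assuming the 4-byte int + 8-byte long key layout described in the question (the userID value is hypothetical):

    import java.util.Arrays
    import org.apache.hadoop.hbase.client.Scan
    import org.apache.hadoop.hbase.filter.FuzzyRowFilter
    import org.apache.hadoop.hbase.util.{Bytes, Pair}
    import scala.collection.JavaConverters._

    val userId = 42 // hypothetical
    // Row template: 4 fixed bytes of userID followed by 8 "don't care" bytes.
    val template = Bytes.add(Bytes.toBytes(userId), new Array[Byte](8))
    val mask = new Array[Byte](12)
    Arrays.fill(mask, 4, 12, 1.toByte) // 0 = must match, 1 = fuzzy
    val scan = new Scan()
    scan.setFilter(new FuzzyRowFilter(List(new Pair(template, mask)).asJava))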
2 votes, 1 answer

Create an RDD based on a subset of HBase rows

I'm trying to create an RDD based on data from an HBase table: val targetRDD = sparkContext.newAPIHadoopRDD(hBaseConfig, classOf[TableInputFormat], classOf[ImmutableBytesWritable], classOf[Result]) .map { case (key, row) => parse(key, row) …
Aliaxander
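To read only part of the table, a Scan with a key range can be serialized into the configuration before building the RDD. A sketch reusing the question's hBaseConfig and sparkContext; the key range itself is a placeholder:

    import org.apache.hadoop.hbase.client.{Result, Scan}
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.{TableInputFormat, TableMapReduceUtil}
    import org.apache.hadoop.hbase.util.Bytes

    val scan = new Scan()
    scan.setStartRow(Bytes.toBytes("key-0100")) // placeholder range
    scan.setStopRow(Bytes.toBytes("key-0200"))
    hBaseConfig.set(TableInputFormat.SCAN, TableMapReduceUtil.convertScanToString(scan))
    val targetRDD = sparkContext.newAPIHadoopRDD(hBaseConfig,
      classOf[TableInputFormat], classOf[ImmutableBytesWritable], classOf[Result])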
2 votes, 1 answer

Putting multiple column names from an HBase table into one Spark RDD

I have to put multiple column families from a table in HBase into one Spark RDD. I am attempting this using the following code (question edited after first answer): import org.apache.hadoop.hbase.client.{HBaseAdmin, Result} import…
Ravi Ranjan
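One way to fold several families into one RDD, sketched in Scala: flatten each Result into (rowKey, family, qualifier, value) tuples. The family names cf1/cf2 are placeholders, not taken from the question.

    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.util.Bytes
    import scala.collection.JavaConverters._

    def flatten(result: Result): Seq[(String, String, String, String)] = {
      val rowKey = Bytes.toString(result.getRow)
      for {
        cf <- Seq("cf1", "cf2") // placeholder family names
        // getFamilyMap returns null when the row has no cells in that family
        familyMap <- Option(result.getFamilyMap(Bytes.toBytes(cf))).toSeq
        (qualifier, value) <- familyMap.asScala.toSeq
      } yield (rowKey, cf, Bytes.toString(qualifier), Bytes.toString(value))
    }
    // usage: rdd.flatMap { case (_, result) => flatten(result) }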
2 votes, 2 answers

Writing a Spark RDD to an HBase table using Scala

I am trying to write a Spark RDD to an HBase table using Scala (which I haven't used before). The entire code is this: import org.apache.hadoop.hbase.client.{HBaseAdmin, Result} import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor} import…
Ravi Ranjan
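The usual approach, sketched here under the assumption of an RDD of (rowKey, value) string pairs: turn each element into a Put and write through TableOutputFormat. The table, family, and qualifier names are placeholders.

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Put
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableOutputFormat
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.hadoop.mapreduce.Job
    import org.apache.spark.rdd.RDD

    def save(rdd: RDD[(String, String)]): Unit = {
      val conf = HBaseConfiguration.create()
      conf.set(TableOutputFormat.OUTPUT_TABLE, "my_table") // placeholder
      val job = Job.getInstance(conf)
      job.setOutputFormatClass(classOf[TableOutputFormat[ImmutableBytesWritable]])
      rdd.map { case (rowKey, value) =>
        val put = new Put(Bytes.toBytes(rowKey))
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col"), Bytes.toBytes(value))
        (new ImmutableBytesWritable, put)
      }.saveAsNewAPIHadoopDataset(job.getConfiguration)
    }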
2 votes, 3 answers

Using subprocess for accessing HBase

I'm trying simple commands to access HBase through subprocess in Python. The following code gives me the wrong output: import subprocess cmd=['hbase','shell','list'] subprocess.call(cmd) Instead of giving me the list of tables in HBase, I get the…
anonymous
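The shell ignores 'list' as a command-line argument: hbase shell reads commands from stdin (or from a script file passed as an argument), so the command has to be piped in. A sketch of the same invocation pattern, in Scala to match the other examples here; in Python the equivalent is feeding the command to the process's stdin.

    import scala.sys.process._

    // Equivalent of: echo "list" | hbase shell
    val output = (Process(Seq("echo", "list")) #| Process(Seq("hbase", "shell"))).!!
    println(output)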
2 votes, 1 answer

Apache Phoenix Join Fails (Encountered exception in sub plan [0] execution)

Here are the CREATE TABLE statements of the tables I'm testing, which were actually taken from Phoenix: CREATE TABLE Test.Employee( Region VARCHAR NOT NULL, LocalID VARCHAR NOT NULL, Name VARCHAR, StartDate DATE, CONSTRAINT pk PRIMARY KEY(Region,…
nardqueue
2 votes, 2 answers

How to randomly display rows in the HBase shell?

I am querying HBase to get a set of keys and values using the LIMIT clause. Here is the query: hbase(main):015:0> scan 'sample_table', {FILTER => "KeyOnlyFilter()", TIMESTAMP => 11, LIMIT => 2} and I get some output. If I repeat the same query I get…
Alex Raj Kaliamoorthy
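A plain LIMIT scan always starts from the beginning of the table, so it returns the same rows every time. For sampling, RandomRowFilter lets each row through with a fixed probability; a sketch of the client-side setup (the 1% chance is arbitrary):

    import org.apache.hadoop.hbase.client.Scan
    import org.apache.hadoop.hbase.filter.RandomRowFilter

    val scan = new Scan()
    // Each row passes with probability 0.01, so repeated scans differ.
    scan.setFilter(new RandomRowFilter(0.01f))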
2 votes, 1 answer

Error accessing HBase with HiveServer2

I use Hue to execute the Hive SQL show tables; and everything is OK. But when I executed the Hive SQL select * from tablea limit 1; I got the exception: java.net.SocketTimeoutException: callTimeout=60000, callDuration=68043: row 'log,,00000000000000' on table…
Daniel
2 votes, 1 answer

Serialization problems when using Spark to read from HBase

I want to implement a class that has a function that reads from HBase via Spark, like this: public abstract class QueryNode implements Serializable { private static final long serialVersionUID = -2961214832101500548L; private int id; private int…
yaowin
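The usual cause is holding an HBase Connection or Configuration (neither is Java-serializable) in a field of the class shipped to executors. A sketch of the standard workaround, assuming a lookup by row key against a placeholder table: build the connection inside mapPartitions, on the executor, instead of storing it in the class.

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Get}
    import org.apache.hadoop.hbase.util.Bytes
    import org.apache.spark.rdd.RDD

    def lookup(keys: RDD[String]): RDD[Array[Byte]] = keys.mapPartitions { iter =>
      val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
      val table = connection.getTable(TableName.valueOf("my_table")) // placeholder
      val results = iter
        .map(k => table.get(new Get(Bytes.toBytes(k))).value()) // first cell's value
        .toList // materialize before closing the connection
      table.close(); connection.close()
      results.iterator
    }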