Questions tagged [hbase]

HBase is the Hadoop database (columnar). Use it when you need random, real-time read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows × millions of columns -- atop clusters of commodity hardware.

HBase is an open-source, non-relational, distributed, versioned, column-oriented database modeled after Google's Bigtable ("Bigtable: A Distributed Storage System for Structured Data" by Chang et al.) and is written in Java. It is developed as part of the Apache Software Foundation's Apache Hadoop project and runs on top of HDFS (the Hadoop Distributed File System): just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of HDFS. HBase includes:

  • Convenient base classes for backing Hadoop MapReduce jobs with HBase tables, including Cascading, Hive, and Pig source and sink modules
  • Query predicate push down via server side scan and get filters
  • Optimizations for real time queries
  • A Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options
  • Extensible JRuby-based (JIRB) shell
  • Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX
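The random, real-time read/write access described above is easiest to see in the HBase shell. A minimal session sketch, assuming a running cluster and a hypothetical table named `demo`:

```
hbase(main):001:0> create 'demo', 'cf'                       # table with one column family
hbase(main):002:0> put 'demo', 'row1', 'cf:greeting', 'hello'
hbase(main):003:0> get 'demo', 'row1'
COLUMN                CELL
 cf:greeting          timestamp=..., value=hello
hbase(main):004:0> scan 'demo'
```

Each value is addressed by (row key, column family:qualifier, timestamp); `get` fetches a single row directly, while `scan` iterates over a key range.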
6961 questions
2
votes
1 answer

Displaying image from HBase in SpotFire

I've converted my images and stored them in HBase as bytes. Now I want SpotFire to read the images (as bytes) from HBase and display them. I understand that I can use the Phoenix connector to connect to HBase from SpotFire, but how can I render images (which…
John Thomas
  • 212
  • 3
  • 21
2
votes
1 answer

Unable to import data from Hdfs to Hbase using importtsv

I moved a tab-delimited file into HDFS and am now trying to move it to HBase. Below is my importtsv command: bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv…
Elijah
  • 306
  • 1
  • 12
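For reference, a typical ImportTsv invocation looks like the sketch below; the table name, column family, and HDFS path are placeholders, and tab is ImportTsv's default separator:

```
bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:col1,cf:col2 \
  mytable /user/hadoop/data.tsv
```

The `-Dimporttsv.columns` spec must contain exactly one `HBASE_ROW_KEY` and match the number of fields per line; a mismatch between the column spec and the file's field count is a common cause of failed imports.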
2
votes
0 answers

Use of HBase Get constructor with rowOffset

What is the use of this specific Get constructor in HBase? public Get(byte[] row, int rowOffset, int rowLength) I am looking for examples that use part(s) of a composite row key to get columns without doing a full table scan. Any…
Puneet Khatod
  • 161
  • 1
  • 5
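For context, that constructor copies `rowLength` bytes of `row` starting at `rowOffset` to form the Get's row key, which is convenient when a composite key already sits inside a larger buffer. The sketch below (hypothetical key layout, plain Java only) mimics that copy; note that a Get always targets exactly one complete row, so a key prefix alone cannot fetch multiple rows. For prefix matching you would typically use a Scan with `setRowPrefixFilter` instead.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class RowPrefix {
    // Mimics what Get(byte[] row, int rowOffset, int rowLength) does internally:
    // copy rowLength bytes starting at rowOffset to form the row key.
    static byte[] slice(byte[] row, int offset, int length) {
        return Arrays.copyOfRange(row, offset, offset + length);
    }

    public static void main(String[] args) {
        // Hypothetical composite key: <userId> '#' <date>
        byte[] composite = "user123#2017-04-12".getBytes(StandardCharsets.UTF_8);
        int sep = 7; // position of '#'
        byte[] prefix = slice(composite, 0, sep);
        System.out.println(new String(prefix, StandardCharsets.UTF_8)); // prints "user123"
    }
}
```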
2
votes
0 answers

How to count the versions in one cell using the HBase shell?

If I set 100 versions for the HBase column family f1 and then put some data (I don't know how many writes) into the same rowkey (e.g. r1) and the same cell f1:c1, how do I find out how many values I put into r1, f1:c1? I want to count the current number of versions in the r1, f1:c1 cell…
Guo
  • 1,761
  • 2
  • 22
  • 45
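Assuming the family was created with VERSIONS => 100, a get that requests all versions prints one line per stored version, which can then be counted (the table name below is a placeholder):

```
hbase(main):001:0> get 't1', 'r1', {COLUMN => 'f1:c1', VERSIONS => 100}
COLUMN                CELL
 f1:c1                timestamp=1491836364921, value=...
 f1:c1                timestamp=1491836364755, value=...
 ...
```

Each version is distinguished by its timestamp, and only up to the family's configured VERSIONS are retained after compaction, so the count reflects retained versions, not all historical writes.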
2
votes
0 answers

HBase - Connection Reset by peer Exception

I am trying to use HBase to build some real-time APIs. My use case is to support ~10,000 concurrent requests per second. I am trying to do some connection pooling so as to achieve multi-threaded access. I followed this documentation to…
Deepu
  • 21
  • 3
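On connection pooling in the HBase 1.x client: a `Connection` from `ConnectionFactory.createConnection` is heavyweight and thread-safe, and is meant to be shared across threads, while `Table` instances are lightweight, not thread-safe, and should be created per request. A sketch under those assumptions (hypothetical table and key; requires hbase-client and a reachable cluster):

```java
// Sketch only: needs a running HBase cluster and hbase-client on the classpath.
Configuration conf = HBaseConfiguration.create();
try (Connection connection = ConnectionFactory.createConnection(conf)) { // shared, thread-safe
    // Per request (e.g. inside each worker thread):
    try (Table table = connection.getTable(TableName.valueOf("api_data"))) { // cheap, not thread-safe
        Result r = table.get(new Get(Bytes.toBytes("somekey")));
    }
}
```

Resetting the connection per request (or sharing a `Table` across threads) is a frequent cause of "connection reset by peer" style failures under load.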
2
votes
2 answers

HRegionServer shows "error telling master we are up". Showing socket exception: Invalid argument

I am trying to create an HBase cluster on 3 CentOS machines. Hadoop (v2.8.0) is up and running, and on top of it I configured HBase (v1.2.5). HBase startup is fine: it started the HMaster and region servers, but the following error still shows in the region servers…
Pramod
  • 31
  • 5
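A common cause of "error telling master we are up" is hostname resolution: a region server that resolves its own hostname (or the master's) to a loopback address reports an address the master cannot reach. One thing worth checking, with `master1`, `rs1`, `rs2` as placeholder hostnames, is that /etc/hosts on every node maps each hostname to its routable IP:

```
# /etc/hosts on every node (placeholder addresses):
192.168.1.10  master1
192.168.1.11  rs1
192.168.1.12  rs2
# avoid entries like:  127.0.0.1  master1
```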
2
votes
0 answers

Spark Job Processing Time increases to 4s without explanation

We are running a cluster with 1 namenode and 3 datanodes on Azure. On top of this I am running my Spark job in yarn-cluster mode. We are using HDP 2.5, which has Spark 1.6.2 integrated into its setup. Now I have this very weird issue where…
Biplob Biswas
  • 1,761
  • 19
  • 33
2
votes
2 answers

Gremlin Server: serving multiple graphs from HBase tables

I am using Gremlin Server with HBase as the back-end. I read that for storing multiple graphs we have to use distinct tables, so I have multiple graphs stored in HBase under different table names. The property storage.hbase.tablename is specified in…
Nithin A
  • 374
  • 1
  • 2
  • 18
2
votes
1 answer

My RowFilter regexstring is wrong

I want to get all rows ending with 2017-04-12. My shell command is scan 'tr_log_v2', {FILTER => "RowFilter(=, 'regexstring:*2017-04-12')"}, and the error Incorrect filter string RowFilter(=, 'regexstring:*2017-04-12') happens. What is the right way to do…
Charles
  • 53
  • 3
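The problem with that pattern is the leading `*`: HBase's RegexStringComparator uses java.util.regex, where `*` is a quantifier that must follow something repeatable, so `*2017-04-12` is rejected as a dangling metacharacter. The runnable sketch below (hypothetical row keys) shows the rejection and a working anchored pattern:

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RowRegex {
    public static void main(String[] args) {
        // A leading '*' has nothing to repeat, so java.util.regex (which backs
        // HBase's RegexStringComparator) rejects the pattern outright.
        try {
            Pattern.compile("*2017-04-12");
        } catch (PatternSyntaxException e) {
            System.out.println("rejected: " + e.getDescription());
        }
        // Prefix with '.*' and anchor with '$' to match row keys ending in the date.
        System.out.println(Pattern.matches(".*2017-04-12$", "tr_000042_2017-04-12")); // prints "true"
    }
}
```

In the shell, the corresponding filter would be scan 'tr_log_v2', {FILTER => "RowFilter(=, 'regexstring:.*2017-04-12$')"}.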
2
votes
1 answer

Is HBase a columnar DB?

An HBase table is based on column families; this means that each column is a tuple and the columns of a family are stored together. Does this mean that HBase is not a columnar DB? Columnar DBs are efficient in I/O and can do better compression, since data of a single…
sami
  • 501
  • 2
  • 6
  • 18
2
votes
0 answers

org.apache.spark.rdd.NewHadoopRDD - Failed to use InputSplit#getLocationInfo

I am using Java (8) to connect Spark (1.6.0) with HBase (1.2.2, ZooKeeper 3.4.6). The Scala code on the Spark client is OK. The Spark cluster, HBase cluster, and ZooKeeper cluster are in the cloud. The code is below: package com.carelinker.spark; import…
2
votes
0 answers

Kafka Spark streaming HBase insert issues

I'm using Kafka to send a file with 3 columns, using Spark Streaming 1.3 to insert it into HBase. This is what my HBase table looks like: ROW COLUMN+CELL zone:bizert column=travail:call, timestamp=1491836364921,…
Zied Hermi
  • 229
  • 1
  • 2
  • 11
2
votes
3 answers

Can an HBase table be partitioned based on time?

I need to get data based on a time range. Is there any way to partition an HBase table based on a time range? Ex: I want data, say, from 9:00 to 9:05.
Karthik
  • 21
  • 3
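Two shell sketches, depending on where the time lives (table name and key format below are hypothetical): if the time is encoded in the row key, a bounded scan reads only that key range; if you mean cell timestamps, TIMERANGE (epoch milliseconds) restricts by write time instead:

```
# time encoded at the front of the row key: bound the scan by key range
hbase(main):001:0> scan 'events', {STARTROW => '2017-04-12 09:00', STOPROW => '2017-04-12 09:05'}

# filtering by cell timestamps instead (epoch millis, [min, max))
hbase(main):002:0> scan 'events', {TIMERANGE => [1491987600000, 1491987900000]}
```

Because HBase stores rows sorted by key, putting the time at the front of the row key makes a time-range query a contiguous (and therefore cheap) scan, at the cost of hotspotting on writes.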
2
votes
1 answer

Loading a CSV file into HBase through Spark

This is a simple "how to" question: we can bring data into the Spark environment through com.databricks.spark.csv. I know how to create an HBase table through Spark and write data to HBase tables manually. But is it even possible to load a…
user3521180
  • 1,044
  • 2
  • 20
  • 45
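One common pattern for the write side is to map each parsed line to a `Put` and save the pair RDD through `TableOutputFormat`. A sketch under stated assumptions (Spark 1.x Java API, hypothetical table, family, and path; needs a running cluster and the HBase MapReduce classes on the classpath):

```java
// Sketch only: cluster-dependent, names are placeholders.
Configuration conf = HBaseConfiguration.create();
conf.set(TableOutputFormat.OUTPUT_TABLE, "mytable");
Job job = Job.getInstance(conf);
job.setOutputFormatClass(TableOutputFormat.class);

JavaRDD<String> lines = sc.textFile("hdfs:///data/input.csv");
JavaPairRDD<ImmutableBytesWritable, Put> puts = lines.mapToPair(line -> {
    String[] f = line.split(",");
    Put put = new Put(Bytes.toBytes(f[0]));  // first CSV field as row key
    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("col1"), Bytes.toBytes(f[1]));
    return new Tuple2<>(new ImmutableBytesWritable(), put);
});
puts.saveAsNewAPIHadoopDataset(job.getConfiguration());
```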
2
votes
2 answers

How to retrieve records based on a condition from an HBase table?

I have an HBase table employeedetails with column families personaldetails (columns: firstname, lastname) and professionaldetails (columns: company, empid), and it has the following data in it: 1 column=personaldetails:firstname, timestamp=1490959927100,…
Metadata
  • 2,127
  • 9
  • 56
  • 127
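For value-based conditions in the shell, a SingleColumnValueFilter sketch against the table described above (the value 'John' is a placeholder):

```
hbase(main):001:0> scan 'employeedetails', {FILTER => "SingleColumnValueFilter('personaldetails', 'firstname', =, 'binary:John')"}
```

Note this is a full-table filter, not an index lookup: every row is still scanned server-side, and rows missing the column are returned unless `setFilterIfMissing` is enabled via the Java API.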