Questions tagged [hbase]

HBase is the Hadoop database (columnar). Use it when you need random, real-time read/write access to your Big Data. The project's goal is the hosting of very large tables -- billions of rows by millions of columns -- atop clusters of commodity hardware.

HBase is an open-source, non-relational, distributed, versioned, column-oriented database modeled after Google's Bigtable (described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al.) and written in Java. It is developed as part of the Apache Software Foundation's Apache Hadoop project. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of the Hadoop Distributed File System (HDFS). HBase includes:

  • Convenient base classes for backing Hadoop MapReduce jobs with HBase tables, including Cascading, Hive, and Pig source and sink modules
  • Query predicate push down via server-side scan and get filters (see the sketch after this list)
  • Optimizations for real-time queries
  • A Thrift gateway and a RESTful web service that supports XML, Protobuf, and binary data encoding options
  • Extensible JRuby-based (JIRB) shell
  • Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX
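
To illustrate the predicate push-down point above, here is a minimal sketch using the HBase client API from Scala; the table name, column family, and values are placeholders:

```scala
import scala.collection.JavaConverters._
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Scan}
import org.apache.hadoop.hbase.filter.{CompareFilter, SingleColumnValueFilter}
import org.apache.hadoop.hbase.util.Bytes

object FilterPushdownSketch extends App {
  val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val table = connection.getTable(TableName.valueOf("mytable"))

  // The filter is evaluated on the region servers, so only matching
  // rows cross the network back to the client.
  val scan = new Scan()
  scan.setFilter(new SingleColumnValueFilter(
    Bytes.toBytes("cf"), Bytes.toBytes("status"),
    CompareFilter.CompareOp.EQUAL, Bytes.toBytes("active")))

  val scanner = table.getScanner(scan)
  scanner.asScala.foreach(r => println(Bytes.toString(r.getRow)))
  scanner.close()
  table.close()
  connection.close()
}
```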
6961 questions
2
votes
0 answers

Track/view HBase compaction job/status in the HBase shell

I have run a major compaction for a particular region in the HBase shell. (Within the terminal, with the help of the HMaster node, I track via elinks whether the table's compaction state is MAJOR or NONE.) But now I want to see the percentage completed, and is it…
Big-BoB
  • 81
  • 2
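
HBase does not expose a percent-complete figure for compactions, but the state can be polled programmatically. A minimal sketch against the HBase 1.x Admin API (table name is a placeholder):

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory

object CompactionStateSketch extends App {
  val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val admin = connection.getAdmin

  // Returns NONE, MINOR, MAJOR, or MAJOR_AND_MINOR for the whole table;
  // getCompactionStateForRegion(regionName) does the same per region.
  val state = admin.getCompactionState(TableName.valueOf("mytable"))
  println(s"compaction state: $state")

  admin.close()
  connection.close()
}
```

Polling this in a loop until it returns NONE is the usual way to detect completion.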
2
votes
3 answers

IntelliJ IDEA - Disable INFO messages when running a Spark application

I'm getting so many messages when running an application that uses the Apache Spark and HBase/Hadoop libraries. For example: 0 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate…
questionasker
  • 2,536
  • 12
  • 55
  • 119
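
One common way to quiet this output, assuming the classic log4j 1.x backend that Spark and Hadoop shipped with at the time, is to raise the log level programmatically before the SparkContext is created:

```scala
import org.apache.log4j.{Level, Logger}

// Raise the threshold for the noisiest namespaces; anything below
// WARN (INFO, DEBUG) from these loggers is then dropped.
Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
Logger.getLogger("org.apache.hadoop").setLevel(Level.WARN)
Logger.getLogger("org.apache.zookeeper").setLevel(Level.WARN)
```

The same effect can be had by editing conf/log4j.properties, which also catches messages logged before any application code runs.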
2
votes
1 answer

Scanner caching and next method in ResultScanner interface

This is an excerpt from the book HBase in Action, about scanner caching. The ResultScanner interface also has a next(int) call that you can use to ask it to return the next n rows from the scan. This is an API convenience that doesn't have any …
Viraj
  • 777
  • 1
  • 13
  • 32
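
A small sketch of the distinction the book is drawing, using the HBase client API (table name is a placeholder): setCaching controls how many rows each RPC fetches, while next(n) is client-side convenience over the same stream:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Result, Scan}

object ScannerCachingSketch extends App {
  val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val table = connection.getTable(TableName.valueOf("mytable"))

  val scan = new Scan()
  scan.setCaching(100) // rows transferred per RPC to the region server

  val scanner = table.getScanner(scan)
  // next(10) returns up to 10 rows as an array; it does not change
  // how many rows each underlying RPC carries.
  val batch: Array[Result] = scanner.next(10)
  println(s"got ${batch.length} rows")

  scanner.close()
  table.close()
  connection.close()
}
```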
2
votes
2 answers

How to copy an HBase table from an hbase-0.94 cluster to an hbase-0.98 cluster

We have an hbase-0.94 cluster with hadoop-1.0.1. We don't want any downtime for this cluster while upgrading to hbase-0.98 with hadoop-2.5.1. I have provisioned another hbase-0.98 cluster with hadoop-2.5.1 and want to copy the hbase-0.94 tables to…
Santosh Kumar
  • 761
  • 5
  • 28
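
Since direct replication between 0.94 and 0.98 is awkward (the RPC format changed between those versions), one hedged approach is Export on the old cluster, distcp, and Import on the new one; hosts and paths below are placeholders:

```bash
# On the 0.94 cluster: dump the table to sequence files on HDFS.
hbase org.apache.hadoop.hbase.mapreduce.Export mytable /backup/mytable

# Copy across clusters; reading via hftp on the source side works
# across the Hadoop 1.x -> 2.x version boundary.
hadoop distcp hftp://old-nn:50070/backup/mytable hdfs://new-nn:8020/backup/mytable

# On the 0.98 cluster: create the table (same column families), then load.
hbase org.apache.hadoop.hbase.mapreduce.Import mytable /backup/mytable
```

Rows written after the first Export can be caught with a second, smaller Export run using its start-time argument before the final cutover.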
2
votes
1 answer

How to create an external Hive table with complex data types that points to an HBase table?

I have an HBase table with column families (Name, Contact) and columns Name (String), Age (String), workStreet (String), workCity (String), workState (String). I want to create an external Hive table that points to this HBase table with the following…
sen
  • 198
  • 2
  • 9
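
A sketch of what such a DDL can look like, assuming the Hive HBase storage handler; mapping the whole Contact family onto a Hive map<string,string> is one way to bring in the complex-type aspect (the column families follow the question, the Hive and HBase table names are made up):

```sql
CREATE EXTERNAL TABLE hbase_people(
  rowkey  string,
  name    string,
  age     string,
  contact map<string,string>)  -- whole Contact family as key/value pairs
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,Name:Name,Name:Age,Contact:")
TBLPROPERTIES ("hbase.table.name" = "people");
```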
2
votes
2 answers

How to count all rows in an HBase table using Scala

We can count all rows using the HBase shell with the command count 'table_name', INTERVAL => 1, or simply count 'table_name'. But how can this be done in Scala?
questionasker
  • 2,536
  • 12
  • 55
  • 119
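
A minimal Scala sketch using a client-side scan; FirstKeyOnlyFilter keeps the data transfer small because only one cell per row comes back (table name is a placeholder):

```scala
import scala.collection.JavaConverters._
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Scan}
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter

object RowCountSketch extends App {
  val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val table = connection.getTable(TableName.valueOf("table_name"))

  val scan = new Scan()
  scan.setFilter(new FirstKeyOnlyFilter()) // one cell per row is enough to count it

  val scanner = table.getScanner(scan)
  val count = scanner.asScala.size // iterates the whole table client-side
  println(s"row count: $count")

  scanner.close()
  table.close()
  connection.close()
}
```

For very large tables, the bundled RowCounter MapReduce job (org.apache.hadoop.hbase.mapreduce.RowCounter) counts in parallel instead.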
2
votes
1 answer

Determine exactly which files from HDFS to load into one HBase table using Java?

I'm new to Big Data and Hadoop. I'm learning Hadoop and HBase. I have a problem and still have no idea how to solve it. Could you help me? I've put 3 CSV files into HDFS: File 1 (Subscribe_info.txt): numID, active_date, status; File 2 (Recharge.txt): numID,…
Bing Farm
  • 75
  • 1
  • 8
2
votes
1 answer

Cannot create Spark Phoenix DataFrames

I am trying to load data from Apache Phoenix into a Spark DataFrame. I have been able to successfully create an RDD with the following code: val sc = new SparkContext("local", "phoenix-test") val sqlContext = new…
Soto
  • 611
  • 6
  • 19
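
Assuming the phoenix-spark integration module is on the classpath, the DataFrame route looks roughly like the sketch below; the table name and ZooKeeper URL are placeholders matching the question's style:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object PhoenixDataFrameSketch extends App {
  val sc = new SparkContext(new SparkConf().setMaster("local").setAppName("phoenix-test"))
  val sqlContext = new SQLContext(sc)

  // The phoenix-spark data source reads the table in parallel,
  // one Spark partition per Phoenix/HBase split.
  val df = sqlContext.read
    .format("org.apache.phoenix.spark")
    .option("table", "TABLE1")
    .option("zkUrl", "serverurl:/hbase-unsecure")
    .load()

  df.show()
}
```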
2
votes
0 answers

Escaping the separator in data while bulk loading with the importtsv tool, and ingesting numeric values

I am using the importtsv tool to ingest data (HBase 1.1.5) and have some doubts. First, does it ingest non-string/numeric values? I was referring to a link detailing importtsv in the Cloudera distribution. It says: "it interprets everything as…
Mahesha999
  • 22,693
  • 29
  • 116
  • 189
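
For reference, a typical invocation looks like the sketch below (table, columns, and paths are placeholders). As far as I know, importtsv offers no escape mechanism for the separator, so a character that never occurs in the data has to be chosen, and every field, numeric or not, is stored as the UTF-8 bytes of its text:

```bash
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  '-Dimporttsv.separator=|' \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:amount,cf:status \
  mytable /input/data.tsv
```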
2
votes
0 answers

HBase REST API and basic authentication

I'd like to set up the HBase REST server to use basic authentication, but all the documentation I find is about Kerberos. Does anybody know how to do this? Is it even possible? Thanks for your help!
padmalcom
  • 1,156
  • 3
  • 16
  • 30
2
votes
2 answers

How to obtain Phoenix table data via HBase REST service

I created an HBase table using the Phoenix JDBC driver in the following code snippet: Class.forName("org.apache.phoenix.jdbc.PhoenixDriver"); Connection conn = DriverManager.getConnection("jdbc:phoenix:serverurl:/hbase-unsecure"); …
D. Müller
  • 3,336
  • 4
  • 36
  • 84
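
Phoenix stores its rows as ordinary HBase cells, so the stock REST gateway can return them, but only as raw encoded bytes (base64 in the JSON representation), not as typed Phoenix values. A hedged sketch, with host, port, table, and row key as placeholders:

```bash
# Phoenix upper-cases unquoted identifiers, so the HBase-level table
# name is typically the upper-case form of the one used in JDBC.
curl -H "Accept: application/json" http://resthost:8080/TABLE1/rowkey1
```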
2
votes
2 answers

Querying billions of records in real time

I am working on a data-driven analytical software project that produces reports and recommendations on financial data (transactions). The data consists of 1.7 billion records, with 200k new records added every day. Each record describes a transaction…
Steven M
  • 574
  • 4
  • 20
2
votes
1 answer

Saving protobuf in HBase/HDFS using Spark Streaming

I am looking to store protobuf messages in HBase/HDFS using Spark Streaming, and I have the following two questions: What is an efficient way of storing a huge number of protobuf messages, and an efficient way of retrieving them for analytics? For…
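
For the HBase side, one sketch is a cell per message: protobuf's generated classes all expose toByteArray(), and the resulting bytes can be written as a cell value. The table and family names here are made up, and payload stands for any serialized message:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes

object ProtobufStoreSketch {
  // payload = message.toByteArray for any protobuf-generated class
  def store(rowKey: String, payload: Array[Byte]): Unit = {
    val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val table = connection.getTable(TableName.valueOf("messages"))

    val put = new Put(Bytes.toBytes(rowKey))
    put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("proto"), payload)
    table.put(put)

    table.close()
    connection.close()
  }
}
```

In a streaming job the connection would normally be opened once per partition (foreachPartition) rather than once per message.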
2
votes
1 answer

Multiple rows insertion in HBase using MapReduce

I want to insert N rows into an HBase table from each mapper in batches. I am currently aware of two ways of doing this: create a list of Put objects and use the put(List<Put> puts) method of an HTable instance, making sure to disable autoFlush…
hp36
  • 269
  • 1
  • 6
  • 20
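
A sketch of the first approach with the HTable API of that era (0.94/0.98; these methods were deprecated in later releases), with table, family, and values made up:

```scala
import java.util.{ArrayList => JArrayList}
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Put}
import org.apache.hadoop.hbase.util.Bytes

object BatchPutSketch extends App {
  val table = new HTable(HBaseConfiguration.create(), "mytable")
  // With autoFlush off, puts collect in the client-side write buffer
  // and go out in batches instead of one RPC per Put.
  table.setAutoFlush(false)

  val puts = new JArrayList[Put]()
  for (i <- 1 to 1000) {
    val put = new Put(Bytes.toBytes(s"row-$i"))
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(s"value-$i"))
    puts.add(put)
  }

  table.put(puts)      // one call for the whole list
  table.flushCommits() // drain whatever is still buffered
  table.close()
}
```

Inside a mapper the same pattern applies, with flushCommits called from cleanup(); TableOutputFormat is the usual alternative when the job writes rows through the output format instead.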
2
votes
1 answer

Using HBase for a small dataset and big data analysis at the same time?

I am building an application which requires a lot of data processing and analytics (processing tons of files at the same time). I am planning to use Hadoop (MapReduce, HBase on HDFS) for this. At the same time I have a small dataset like user…
Pradeep Jaiswar
  • 1,785
  • 7
  • 27
  • 48