Questions tagged [hbase]

HBase is the Hadoop database (columnar). Use it when you need random, real-time read/write access to your Big Data. The project's goal is the hosting of very large tables -- billions of rows by millions of columns -- atop clusters of commodity hardware.

HBase is an open-source, non-relational, distributed, versioned, column-oriented database modeled after Google's Bigtable (described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al.) and written in Java. It is developed as part of the Apache Software Foundation's Apache Hadoop project. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of the Hadoop Distributed File System (HDFS). HBase includes:

  • Convenient base classes for backing Hadoop MapReduce jobs with HBase tables, including Cascading, Hive, and Pig source and sink modules
  • Query predicate push down via server-side scan and get filters (see the sketch after this list)
  • Optimizations for real-time queries
  • A Thrift gateway and a RESTful web service that supports XML, Protobuf, and binary data encoding options
  • Extensible JRuby-based (JIRB) shell
  • Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia, or via JMX
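
To illustrate the predicate push-down point above, here is a minimal sketch using the HBase client API from Scala; the table name, column family, and values are placeholders:

```scala
import scala.collection.JavaConverters._
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Scan}
import org.apache.hadoop.hbase.filter.{CompareFilter, SingleColumnValueFilter}
import org.apache.hadoop.hbase.util.Bytes

object FilterPushdownSketch extends App {
  val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val table = connection.getTable(TableName.valueOf("mytable"))

  // The filter is evaluated on the region servers, so only matching
  // rows cross the network back to the client.
  val scan = new Scan()
  scan.setFilter(new SingleColumnValueFilter(
    Bytes.toBytes("cf"), Bytes.toBytes("status"),
    CompareFilter.CompareOp.EQUAL, Bytes.toBytes("active")))

  val scanner = table.getScanner(scan)
  scanner.asScala.foreach(r => println(Bytes.toString(r.getRow)))
  scanner.close()
  table.close()
  connection.close()
}
```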
6961 questions
2
votes
0 answers

Track/view HBase compaction job/status in the HBase shell

I have run a major compaction for a particular region in the HBase shell. (Within the terminal, with the help of the HMaster node, I track via elinks whether the table's compaction state is MAJOR or NONE.) But now I want to see the percentage completed, and is it…
Big-BoB
  • 81
  • 2
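
HBase does not expose a percent-complete figure for compactions, but the state can be polled programmatically. A minimal sketch against the HBase 1.x Admin API (table name is a placeholder):

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.ConnectionFactory

object CompactionStateSketch extends App {
  val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val admin = connection.getAdmin

  // Returns NONE, MINOR, MAJOR, or MAJOR_AND_MINOR for the whole table;
  // getCompactionStateForRegion(regionName) does the same per region.
  val state = admin.getCompactionState(TableName.valueOf("mytable"))
  println(s"compaction state: $state")

  admin.close()
  connection.close()
}
```

Polling this in a loop until it returns NONE is the usual way to detect completion.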
2
votes
3 answers

IntelliJ IDEA - Disable INFO messages when running a Spark application

I'm getting so many messages when running an application that uses the Apache Spark and HBase/Hadoop libraries. For example: 0 [main] DEBUG org.apache.hadoop.metrics2.lib.MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate…
questionasker
  • 2,536
  • 12
  • 55
  • 119
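
One common way to quiet this output, assuming the classic log4j 1.x backend that Spark and Hadoop shipped with at the time, is to raise the log level programmatically before the SparkContext is created:

```scala
import org.apache.log4j.{Level, Logger}

// Raise the threshold for the noisiest namespaces; anything below
// WARN (INFO, DEBUG) from these loggers is then dropped.
Logger.getLogger("org.apache.spark").setLevel(Level.WARN)
Logger.getLogger("org.apache.hadoop").setLevel(Level.WARN)
Logger.getLogger("org.apache.zookeeper").setLevel(Level.WARN)
```

The same effect can be had by editing conf/log4j.properties, which also catches messages logged before any application code runs.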
2
votes
1 answer

Scanner caching and next method in ResultScanner interface

This is an excerpt from the book HBase in Action, about scanner caching. The ResultScanner interface also has a next(int) call that you can use to ask it to return the next n rows from the scan. This is an API convenience that doesn't have any …
Viraj
  • 777
  • 1
  • 13
  • 32
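
A small sketch of the distinction the book is drawing, using the HBase client API (table name is a placeholder): setCaching controls how many rows each RPC fetches, while next(n) is client-side convenience over the same stream:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Result, Scan}

object ScannerCachingSketch extends App {
  val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val table = connection.getTable(TableName.valueOf("mytable"))

  val scan = new Scan()
  scan.setCaching(100) // rows transferred per RPC to the region server

  val scanner = table.getScanner(scan)
  // next(10) returns up to 10 rows as an array; it does not change
  // how many rows each underlying RPC carries.
  val batch: Array[Result] = scanner.next(10)
  println(s"got ${batch.length} rows")

  scanner.close()
  table.close()
  connection.close()
}
```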
2
votes
2 answers

How to copy an HBase table from an hbase-0.94 cluster to an hbase-0.98 cluster

We have an hbase-0.94 cluster with hadoop-1.0.1. We don't want any downtime for this cluster while upgrading to hbase-0.98 with hadoop-2.5.1. I have provisioned another hbase-0.98 cluster with hadoop-2.5.1 and want to copy the hbase-0.94 tables to…
Santosh Kumar
  • 761
  • 5
  • 28
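
Since direct replication between 0.94 and 0.98 is awkward (the RPC format changed between those versions), one hedged approach is Export on the old cluster, distcp, and Import on the new one; hosts and paths below are placeholders:

```bash
# On the 0.94 cluster: dump the table to sequence files on HDFS.
hbase org.apache.hadoop.hbase.mapreduce.Export mytable /backup/mytable

# Copy across clusters; reading via hftp on the source side works
# across the Hadoop 1.x -> 2.x version boundary.
hadoop distcp hftp://old-nn:50070/backup/mytable hdfs://new-nn:8020/backup/mytable

# On the 0.98 cluster: create the table (same column families), then load.
hbase org.apache.hadoop.hbase.mapreduce.Import mytable /backup/mytable
```

Rows written after the first Export can be caught with a second, smaller Export run using its start-time argument before the final cutover.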
2
votes
1 answer

How to create an external Hive table with complex data types that points to an HBase table?

I have an HBase table with column families (Name, Contact) and columns Name (String), Age (String), workStreet (String), workCity (String), workState (String). I want to create an external Hive table that points to this HBase table with the following…
sen
  • 198
  • 2
  • 9
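
A sketch of what such a DDL can look like, assuming the Hive HBase storage handler; mapping the whole Contact family onto a Hive map<string,string> is one way to bring in the complex-type aspect (the column families follow the question, the Hive and HBase table names are made up):

```sql
CREATE EXTERNAL TABLE hbase_people(
  rowkey  string,
  name    string,
  age     string,
  contact map<string,string>)  -- whole Contact family as key/value pairs
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES (
  "hbase.columns.mapping" = ":key,Name:Name,Name:Age,Contact:")
TBLPROPERTIES ("hbase.table.name" = "people");
```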
2
votes
2 answers

How to count all rows in an HBase table using Scala

We can count all rows using the HBase shell with the command count 'table_name', INTERVAL => 1, or simply count 'table_name'. But how can this be done in Scala?
questionasker
  • 2,536
  • 12
  • 55
  • 119
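
A minimal Scala sketch using a client-side scan; FirstKeyOnlyFilter keeps the data transfer small because only one cell per row comes back (table name is a placeholder):

```scala
import scala.collection.JavaConverters._
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Scan}
import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter

object RowCountSketch extends App {
  val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
  val table = connection.getTable(TableName.valueOf("table_name"))

  val scan = new Scan()
  scan.setFilter(new FirstKeyOnlyFilter()) // one cell per row is enough to count it

  val scanner = table.getScanner(scan)
  val count = scanner.asScala.size // iterates the whole table client-side
  println(s"row count: $count")

  scanner.close()
  table.close()
  connection.close()
}
```

For very large tables, the bundled RowCounter MapReduce job (org.apache.hadoop.hbase.mapreduce.RowCounter) counts in parallel instead.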
2
votes
1 answer

Determine exactly which files from HDFS to load into one HBase table using Java?

I'm new to Big Data and Hadoop. I'm learning Hadoop and HBase. I have a problem and still have no idea how to solve it. Could you help me? I've put 3 CSV files into HDFS: File 1 (Subscribe_info.txt): numID, active_date, status; File 2 (Recharge.txt): numID,…
Bing Farm
  • 75
  • 1
  • 8
2
votes
1 answer

Cannot create Spark Phoenix DataFrames

I am trying to load data from Apache Phoenix into a Spark DataFrame. I have been able to successfully create an RDD with the following code: val sc = new SparkContext("local", "phoenix-test") val sqlContext = new…
Soto
  • 611
  • 6
  • 19
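
Assuming the phoenix-spark integration module is on the classpath, the DataFrame route looks roughly like the sketch below; the table name and ZooKeeper URL are placeholders matching the question's style:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object PhoenixDataFrameSketch extends App {
  val sc = new SparkContext(new SparkConf().setMaster("local").setAppName("phoenix-test"))
  val sqlContext = new SQLContext(sc)

  // The phoenix-spark data source reads the table in parallel,
  // one Spark partition per Phoenix/HBase split.
  val df = sqlContext.read
    .format("org.apache.phoenix.spark")
    .option("table", "TABLE1")
    .option("zkUrl", "serverurl:/hbase-unsecure")
    .load()

  df.show()
}
```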
2
votes
0 answers

Escaping the separator in data while bulk loading with the importtsv tool, and ingesting numeric values

I am using the importtsv tool to ingest data (HBase 1.1.5) and have some doubts. First, does it ingest non-string/numeric values? I was referring to a link detailing importtsv in the Cloudera distribution. It says: "it interprets everything as…
Mahesha999
  • 22,693
  • 29
  • 116
  • 189
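
For reference, a typical invocation looks like the sketch below (table, columns, and paths are placeholders). As far as I know, importtsv offers no escape mechanism for the separator, so a character that never occurs in the data has to be chosen, and every field, numeric or not, is stored as the UTF-8 bytes of its text:

```bash
hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
  '-Dimporttsv.separator=|' \
  -Dimporttsv.columns=HBASE_ROW_KEY,cf:amount,cf:status \
  mytable /input/data.tsv
```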
2
votes
0 answers

HBase REST API and basic authentication

I'd like to set up the HBase REST server to use basic authentication, but all the documentation I find is about Kerberos. Does anybody know how to do this? Is it even possible? Thanks for your help!
padmalcom
  • 1,156
  • 3
  • 16
  • 30
2
votes
2 answers

How to obtain Phoenix table data via HBase REST service

I created an HBase table using the Phoenix JDBC driver in the following code snippet: Class.forName("org.apache.phoenix.jdbc.PhoenixDriver"); Connection conn = DriverManager.getConnection("jdbc:phoenix:serverurl:/hbase-unsecure"); …
D. Müller
  • 3,336
  • 4
  • 36
  • 84
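
Phoenix stores its rows as ordinary HBase cells, so the stock REST gateway can return them, but only as raw encoded bytes (base64 in the JSON representation), not as typed Phoenix values. A hedged sketch, with host, port, table, and row key as placeholders:

```bash
# Phoenix upper-cases unquoted identifiers, so the HBase-level table
# name is typically the upper-case form of the one used in JDBC.
curl -H "Accept: application/json" http://resthost:8080/TABLE1/rowkey1
```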
2
votes
2 answers

Querying billions of records in real time

I am working on a data-driven analytical software project that produces reports and recommendations on financial data (transactions). The data consists of 1.7 billion records, with 200k new records added every day. Each record describes a transaction…
Steven M
  • 574
  • 4
  • 20
2
votes
1 answer

Saving protobuf in HBase/HDFS using Spark Streaming

I am looking to store protobuf messages in HBase/HDFS using Spark Streaming, and I have the following two questions: What is an efficient way of storing a huge number of protobuf messages, and an efficient way of retrieving them for analytics? For…
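
For the HBase side, one sketch is a cell per message: protobuf's generated classes all expose toByteArray(), and the resulting bytes can be written as a cell value. The table and family names here are made up, and payload stands for any serialized message:

```scala
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.{ConnectionFactory, Put}
import org.apache.hadoop.hbase.util.Bytes

object ProtobufStoreSketch {
  // payload = message.toByteArray for any protobuf-generated class
  def store(rowKey: String, payload: Array[Byte]): Unit = {
    val connection = ConnectionFactory.createConnection(HBaseConfiguration.create())
    val table = connection.getTable(TableName.valueOf("messages"))

    val put = new Put(Bytes.toBytes(rowKey))
    put.addColumn(Bytes.toBytes("d"), Bytes.toBytes("proto"), payload)
    table.put(put)

    table.close()
    connection.close()
  }
}
```

In a streaming job the connection would normally be opened once per partition (foreachPartition) rather than once per message.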
2
votes
1 answer

Multiple rows insertion in HBase using MapReduce

I want to insert N rows into an HBase table from each mapper in batches. I am currently aware of two ways of doing this: create a list of Put objects and use the put(List<Put> puts) method of an HTable instance, making sure to disable autoFlush…
hp36
  • 269
  • 1
  • 6
  • 20
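
A sketch of the first approach with the HTable API of that era (0.94/0.98; these methods were deprecated in later releases), with table, family, and values made up:

```scala
import java.util.{ArrayList => JArrayList}
import org.apache.hadoop.hbase.HBaseConfiguration
import org.apache.hadoop.hbase.client.{HTable, Put}
import org.apache.hadoop.hbase.util.Bytes

object BatchPutSketch extends App {
  val table = new HTable(HBaseConfiguration.create(), "mytable")
  // With autoFlush off, puts collect in the client-side write buffer
  // and go out in batches instead of one RPC per Put.
  table.setAutoFlush(false)

  val puts = new JArrayList[Put]()
  for (i <- 1 to 1000) {
    val put = new Put(Bytes.toBytes(s"row-$i"))
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes(s"value-$i"))
    puts.add(put)
  }

  table.put(puts)      // one call for the whole list
  table.flushCommits() // drain whatever is still buffered
  table.close()
}
```

Inside a mapper the same pattern applies, with flushCommits called from cleanup(); TableOutputFormat is the usual alternative when the job writes rows through the output format instead.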
2
votes
1 answer

Using HBase for a small dataset and big data analysis at the same time?

I am building an application which requires a lot of data processing and analytics (processing tons of files at the same time). I am planning to use Hadoop (MapReduce, HBase on HDFS) for this. At the same time I have a small dataset like user…
Pradeep Jaiswar
  • 1,785
  • 7
  • 27
  • 48