Questions tagged [hbase]

HBase is the Hadoop database (columnar). Use it when you need random, real-time read/write access to your Big Data. This project's goal is the hosting of very large tables -- billions of rows × millions of columns -- atop clusters of commodity hardware.

HBase is an open-source, non-relational, distributed, versioned, column-oriented database written in Java and modeled after Google's Bigtable, described in "Bigtable: A Distributed Storage System for Structured Data" by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of the Hadoop Distributed File System (HDFS). It is developed as part of the Apache Software Foundation's Apache Hadoop project. HBase includes:

  • Convenient base classes for backing Hadoop MapReduce jobs with HBase tables, including Cascading, Hive, and Pig source and sink modules
  • Query predicate push down via server side scan and get filters
  • Optimizations for real time queries
  • A Thrift gateway and a REST-ful Web service that supports XML, Protobuf, and binary data encoding options
  • Extensible JRuby-based (JIRB) shell
  • Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX
6961 questions
2 votes • 1 answer

Spark Yarn Cluster connection to Hbase error

I have an app that parses VCF files and inserts data into HBase. The app runs with no issue using Apache Spark with master local, but when I run it on an Apache Spark YARN cluster, it fails with the following: 17/03/31 10:36:09 INFO…
Fbkk • 89 • 1 • 4 • 7
2 votes • 2 answers

Connecting to Apache Phoenix using JDBC and Java

I have a small java program in which I try to establish a connection to a remote Phoenix server I have running. package jdbc_tests; import java.sql.Connection; import java.sql.DriverManager; import java.sql.ResultSet; import…
Zeliax • 4,987 • 10 • 51 • 79
2 votes • 0 answers

Is it possible to get the Region and ID of the row of a column in a UDF?

I want to get the row key and Region of the column that is being "operated" on after calling getChildren().get(0); public boolean evaluate(Tuple tuple, ImmutableBytesWritable ptr) { Expression arg = getChildren().get(0); if…
Cheyenne Forbes • 491 • 1 • 5 • 15
2 votes • 3 answers

How to drop a column from a column family in HBase?

To drop a column family, we have the commands below. hbase> disable 'tablename' hbase> alter 'tablename', {NAME => 'COLFAM NAME', METHOD => 'delete'} If there is a column family 'empdetails' in a table 'emptable' with columns 'col1,col2', is there a way to…
Metadata • 2,127 • 9 • 56 • 127
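HBase has no schema-level command for dropping a single column (as opposed to a whole column family): the cells have to be deleted row by row. A minimal sketch of that loop, using a plain Python dict as a hypothetical in-memory stand-in for the table (none of these names are HBase client APIs):

```python
# In-memory stand-in for an HBase table: {row_key: {"family:qualifier": value}}.
# Illustrative only; not the HBase client API.
table = {
    "row1": {"empdetails:col1": "a", "empdetails:col2": "b"},
    "row2": {"empdetails:col1": "c", "empdetails:col2": "d"},
}

def drop_column(table, column):
    # There is no "drop column" inside a family; the client must scan
    # every row and delete that one cell from each.
    for row in table.values():
        row.pop(column, None)

drop_column(table, "empdetails:col1")
```

In a real cluster the same loop would be a full table scan issuing a Delete per row, which is why dropping a whole column family (a schema operation) is so much cheaper than removing one column.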
2 votes • 1 answer

HBase: skip region server to read rows directly from HFile

I am attempting to dump over 10 billion records into HBase, which will grow on average at 10 million per day, and then attempt a full table scan over the records. I understand that a full scan over HDFS will be faster than HBase. HBase is being used…
sunny • 824 • 1 • 14 • 36
2 votes • 0 answers

Inefficient HBase record reader

I made some profiling for my MR job and found that fetching next records for table scan takes ~30% of time spent in mapper. As far as I understand, scanner fetches N rows from server as configured by scan.setCaching and then iterates them locally.…
AdamSkywalker • 11,408 • 3 • 38 • 76
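The excerpt's point about scan caching can be made concrete with a little arithmetic: if each scanner RPC returns up to `caching` rows, a full scan needs roughly total rows divided by the caching value round trips. A minimal sketch (the function name is illustrative, not an HBase API):

```python
import math

def scan_rpc_count(total_rows: int, caching: int) -> int:
    # If each scanner RPC returns up to `caching` rows (scan.setCaching),
    # a full scan needs roughly this many round trips to the region server.
    return math.ceil(total_rows / caching)
```

Raising the caching value trades fewer round trips for more memory per RPC on both client and server, which is the usual lever when record-reader overhead dominates a mapper.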
2 votes • 1 answer

Loading an RDF file into HBase

I am trying to load the contents of an RDF file (Subject, Predicate, Object) into a table in HBase. So far I cannot understand how the contents of the file can be passed to the map method of the mapper class and be stored in HBase. Please provide…
2 votes • 1 answer

Getting all columns from get result in HBase dynamically

I am working on a Get object as retrieved from a table in HBase. I want to dynamically retrieve all column values related to that Get, since I don't know the exact names of the column families. val result1 = hTable.get(g) if (!result1.isEmpty)…
Luckylukeeee • 87 • 1 • 7
2 votes • 1 answer

Writing to HBase (MapR-DB) from a dataframe in Spark 2

I am trying to write a CSV file to an HBase table in Spark 2.0 on a MapR platform (5.2.0). My program is as follows: import org.apache.hadoop.hbase.{HBaseConfiguration, HColumnDescriptor, HTableDescriptor, TableName} import…
Luckylukeeee • 87 • 1 • 7
2 votes • 1 answer

CSV Class Not Found exception

I have a CSV file which is uploaded to HDFS. I am using the opencsv parser for reading the data. My jar file is also in the Hadoop classpath, and it is uploaded to HDFS at the following location: /jars/opencsv-3.9.jar. The error I am getting is also…
2 votes • 0 answers

Writing to HBase table from pyspark

The topic is fully covered in an example from Apache: import sys from pyspark import SparkContext """ Create test table in HBase first: hbase(main):001:0> create 'test', 'f1' 0 row(s) in 0.7840 seconds > hbase_outputformat test row1 f1 q1…
2 votes • 0 answers

HBase and Phoenix: the data queried for the first time is very slow

We have some readings in a table (20M readings) with 12 columns (most of them numbers, so it is not very big). The table is created with Phoenix and its key is serialnumber+timestamp. The table is compressed ('GZ'). It has a salt of 4. The HBase…
2 votes • 2 answers

HBase shell command to get the size of a particular table

How can I get the size of a particular HBase table from HBase shell?
ankit tyagi • 778 • 2 • 16 • 27
2 votes • 1 answer

java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster

When I run start-hbase.sh, HMaster and HRegionServer come up but are no longer visible after some time. Looking at the logs I found this. HMaster: java.lang.RuntimeException: Failed construction of Master: class org.apache.hadoop.hbase.master.HMaster at…
Vishwanath560 • 23 • 1 • 6
2 votes • 0 answers

HBase salting and effective data retrieval on range scans

In order to avoid hotspotting of region servers in HBase, it is advised to avoid sequential row keys. One of the approaches is to salt the first byte of the row key. I want to employ this technique in my client code. Let's say I have n number of…
Ihor M. • 2,728 • 3 • 44 • 70
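The salting scheme the question describes can be sketched in a few lines: a deterministic salt byte is derived from the original key (so writes spread across buckets but the same key always lands in the same bucket), and a range scan over salted keys fans out into one scan per bucket whose results are merged client-side. A minimal Python sketch, assuming n = 4 buckets (all names here are illustrative, not HBase APIs):

```python
import hashlib

N_BUCKETS = 4  # assumption: n salt buckets, typically chosen near the region count

def salt_prefix(row_key: bytes, n: int = N_BUCKETS) -> bytes:
    # Deterministic one-byte salt: hash the original key and take it modulo n,
    # so the same key always maps to the same bucket.
    return bytes([hashlib.md5(row_key).digest()[0] % n])

def salted_key(row_key: bytes) -> bytes:
    # The key actually stored in HBase: salt byte + original key.
    return salt_prefix(row_key) + row_key

def range_scan_bounds(start: bytes, stop: bytes, n: int = N_BUCKETS):
    # A range scan over salted keys fans out into one (start, stop) pair
    # per bucket; the n partial result sets must be merged client-side.
    return [(bytes([b]) + start, bytes([b]) + stop) for b in range(n)]
```

The trade-off is exactly the one the question hints at: point gets stay cheap (recompute the salt from the key), but every range scan costs n parallel scans plus a client-side merge.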