Questions tagged [hbase]

HBase is the Hadoop database (column-oriented). Use it when you need random, real-time read/write access to your Big Data. The project's goal is to host very large tables -- billions of rows by millions of columns -- atop clusters of commodity hardware.

HBase is an open-source, non-relational, distributed, versioned, column-oriented database modeled after Google's Bigtable ("Bigtable: A Distributed Storage System for Structured Data" by Chang et al.) and is written in Java. It is developed as part of the Apache Software Foundation's Apache Hadoop project and runs on top of the Hadoop Distributed File System (HDFS); just as Bigtable leverages the distributed data storage provided by the Google File System, HBase provides Bigtable-like capabilities on top of HDFS. HBase includes:

  • Convenient base classes for backing Hadoop MapReduce jobs with HBase tables, including Cascading, Hive and Pig source and sink modules
  • Query predicate push down via server side scan and get filters
  • Optimizations for real time queries
  • A Thrift gateway and a RESTful Web service that supports XML, Protobuf, and binary data encoding options
  • Extensible JRuby-based (JIRB) shell
  • Support for exporting metrics via the Hadoop metrics subsystem to files or Ganglia; or via JMX
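
For orientation, the sketch below shows the basic Java client workflow the items above assume (connect, put, get). It is a minimal sketch, presuming an HBase client library on the classpath, an hbase-site.xml that points at the ZooKeeper quorum, and an already-created table t1 with column family cf; all of those names are placeholders.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    public class HBaseHelloWorld {
        public static void main(String[] args) throws IOException {
            // Reads hbase-site.xml from the classpath (ZooKeeper quorum, etc.).
            Configuration conf = HBaseConfiguration.create();
            try (Connection connection = ConnectionFactory.createConnection(conf);
                 Table table = connection.getTable(TableName.valueOf("t1"))) {
                // Write one cell: row "row1", column cf:greeting.
                Put put = new Put(Bytes.toBytes("row1"));
                put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("greeting"), Bytes.toBytes("hello"));
                table.put(put);

                // Read it back.
                Result result = table.get(new Get(Bytes.toBytes("row1")));
                byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("greeting"));
                System.out.println(Bytes.toString(value));
            }
        }
    }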
6961 questions
2
votes
1 answer

HBase: Scan with column filter (get rows which do not have a particular column)

I am trying to fetch rows using scan. I need the rows where a particular column is not present. I have tried multiple approaches but none seems to be working. Let's say I want rows where the column "fs" is not present. I have tried the…
Peter
  • 2,719
  • 4
  • 25
  • 55
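
For the question above, one known pattern is to wrap a QualifierFilter in a SkipFilter: any row containing a cell whose qualifier equals "fs" is skipped, so only rows without that column come back. A minimal sketch against the HBase 1.x Java client; the Table is assumed to be obtained as in the connection example near the top of this page.

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.filter.BinaryComparator;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.QualifierFilter;
    import org.apache.hadoop.hbase.filter.SkipFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    class RowsMissingColumn {
        // SkipFilter drops a whole row as soon as one cell fails the wrapped filter,
        // so rows that contain a cell with qualifier "fs" are excluded entirely.
        static void scanRowsWithoutFs(Table table) throws IOException {
            Scan scan = new Scan();
            scan.setFilter(new SkipFilter(
                    new QualifierFilter(CompareFilter.CompareOp.NOT_EQUAL,
                            new BinaryComparator(Bytes.toBytes("fs")))));
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    System.out.println(Bytes.toString(row.getRow()));
                }
            }
        }
    }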
2
votes
1 answer

Eviction of Ignite cache entries at a particular time and storage to HBase

I am storing entries in an IgniteCache, and after each time interval (let's say 1 hour), the entries stored in that hour should get evicted and stored to HBase. How can I achieve this? I tried as follows…
iamLalit
  • 448
  • 3
  • 20
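
Ignite expires entries lazily, which makes "evict and persist exactly at the hour mark" awkward to do with expiry policies alone. One hedged alternative is to drain the cache on a schedule: scan it with a ScanQuery, write each entry to HBase, then remove what was written. This is a sketch under assumptions, not the only design: the cache holds String keys/values, the HBase table "archive" with family "cf" exists, and all names are placeholders.

    import java.io.IOException;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    import javax.cache.Cache;

    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.ignite.IgniteCache;
    import org.apache.ignite.cache.query.QueryCursor;
    import org.apache.ignite.cache.query.ScanQuery;

    class HourlyCacheDrainer {
        // Every hour, copy everything currently in the Ignite cache into HBase,
        // then remove the copied entries from the cache.
        static void schedule(IgniteCache<String, String> cache, Connection hbase) {
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(() -> {
                try (Table table = hbase.getTable(TableName.valueOf("archive"));
                     QueryCursor<Cache.Entry<String, String>> cursor =
                             cache.query(new ScanQuery<String, String>())) {
                    for (Cache.Entry<String, String> e : cursor) {
                        Put put = new Put(Bytes.toBytes(e.getKey()));
                        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("v"), Bytes.toBytes(e.getValue()));
                        table.put(put);
                        cache.remove(e.getKey());   // only remove what was persisted
                    }
                } catch (IOException ex) {
                    ex.printStackTrace();           // real code would retry / alert
                }
            }, 1, 1, TimeUnit.HOURS);
        }
    }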
2
votes
1 answer

Apache Phoenix CREATE statement as SELECT (from)

I am trying to create a new table from an existing structure in Phoenix. Is there a CREATE AS SELECT statement in Phoenix? My attempts are failing with the exception below. Any suggestions are welcome. Thanks in advance. CREATE TABLE…
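
As far as I know Phoenix has no CREATE TABLE AS SELECT; the usual workaround is to create the target table explicitly and then populate it with UPSERT INTO ... SELECT. A hedged JDBC sketch follows; the connection URL, table and column names are placeholders and the Phoenix driver is assumed to be on the classpath.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.SQLException;
    import java.sql.Statement;

    class PhoenixCopyTable {
        public static void main(String[] args) throws SQLException {
            // Phoenix (thick) JDBC URL; adjust the ZooKeeper quorum for your cluster.
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost:2181");
                 Statement stmt = conn.createStatement()) {
                // 1. Create the new table with the structure you need.
                stmt.execute("CREATE TABLE IF NOT EXISTS NEW_TABLE ("
                        + "ID BIGINT NOT NULL PRIMARY KEY, NAME VARCHAR, AMOUNT DECIMAL)");
                // 2. Copy the data across; UPSERT ... SELECT runs inside Phoenix.
                stmt.executeUpdate("UPSERT INTO NEW_TABLE (ID, NAME, AMOUNT) "
                        + "SELECT ID, NAME, AMOUNT FROM OLD_TABLE");
                conn.commit();   // Phoenix connections are not auto-commit by default
            }
        }
    }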
2
votes
0 answers

PySpark: get row from HBase using row key

I have a use case to read from HBase inside a PySpark job and am currently doing a scan on the HBase table like this: conf = {"hbase.zookeeper.quorum": host, "hbase.cluster.distributed": "true", "hbase.mapreduce.inputtable": "table_name",…
void
  • 2,403
  • 6
  • 28
  • 53
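
When the job really only needs one row, the scan handed to TableInputFormat can be narrowed with start/stop row keys instead of scanning the whole table. The keys below (hbase.mapreduce.scan.row.start / .stop) are the TableInputFormat properties as I recall them, so treat this as a hedged sketch; it is shown with a plain Java Hadoop Configuration for clarity, and the same string keys would go into the PySpark conf dict.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    class SingleRowScanConf {
        static Configuration forRowKey(String quorum, String tableName, String rowKey) {
            Configuration conf = HBaseConfiguration.create();
            conf.set("hbase.zookeeper.quorum", quorum);
            conf.set("hbase.mapreduce.inputtable", tableName);
            // Restrict the underlying scan to a single row: start is inclusive,
            // stop is exclusive, so append a zero byte to stop right after the key.
            conf.set("hbase.mapreduce.scan.row.start", rowKey);
            conf.set("hbase.mapreduce.scan.row.stop", rowKey + "\0");
            return conf;
        }
    }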
2
votes
1 answer

Handling image, video and audio types using HBase

Does anybody have any idea how to handle unstructured data like audio, video and images using HBase? I have tried a lot but didn't get anywhere. Any help is appreciated.
user6608138
  • 381
  • 1
  • 4
  • 20
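
Small to medium binary objects can simply be stored as cell values (byte arrays); very large ones are usually better kept in HDFS with only a path/metadata row in HBase, and newer HBase versions offer MOB column families for the middle ground. A hedged sketch of the byte-array route, with table, family and qualifier names as placeholders:

    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    class MediaCell {
        static final byte[] CF = Bytes.toBytes("media");
        static final byte[] QUAL = Bytes.toBytes("bytes");

        // Store the raw file content as a single cell value.
        static void store(Table table, String rowKey, String localPath) throws IOException {
            byte[] content = Files.readAllBytes(Paths.get(localPath));
            Put put = new Put(Bytes.toBytes(rowKey));
            put.addColumn(CF, QUAL, content);
            table.put(put);
        }

        // Read it back as a byte array (null if the row/cell is absent).
        static byte[] load(Table table, String rowKey) throws IOException {
            Result result = table.get(new Get(Bytes.toBytes(rowKey)));
            return result.getValue(CF, QUAL);
        }
    }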
2
votes
1 answer

Image/video on HBase made available via some sort of HTTP URL for access

I want to store some video [binary] files in HBase and make them available via some sort of HTTP URL for access. Can someone help me with the architecture/design for such use cases? I have seen the links below, mostly referring to HDFS; is HDFS better for…
nilesh1212
  • 1,561
  • 2
  • 26
  • 60
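
One common design is a thin HTTP layer in front of HBase (or HBase's bundled REST gateway): the URL carries the row key, the handler fetches the cell and streams the bytes back. A hedged servlet sketch reusing the byte-array storage idea from the previous example; the servlet mapping, table wiring and content type are assumptions, not a definitive implementation.

    import java.io.IOException;

    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    // Mapped to e.g. /media/*, so GET /media/<rowkey> streams the stored bytes.
    public class MediaServlet extends HttpServlet {
        private final Connection hbase;          // created once at startup

        public MediaServlet(Connection hbase) { this.hbase = hbase; }

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws IOException {
            String rowKey = req.getPathInfo().substring(1);   // strip leading '/'
            try (Table table = hbase.getTable(TableName.valueOf("media_table"))) {
                Result result = table.get(new Get(Bytes.toBytes(rowKey)));
                byte[] bytes = result.getValue(Bytes.toBytes("media"), Bytes.toBytes("bytes"));
                if (bytes == null) {
                    resp.sendError(HttpServletResponse.SC_NOT_FOUND);
                    return;
                }
                resp.setContentType("application/octet-stream");
                resp.setContentLength(bytes.length);
                resp.getOutputStream().write(bytes);
            }
        }
    }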
2
votes
1 answer

Scala connect HBase master failure

I wrote Scala code as below: val config: Configuration = HBaseConfiguration.create() config.set("hbase.zookeeper.property.clientPort", zooKeeperClientPort) config.set("hbase.zookeeper.quorum", zooKeeperQuorum) …
张余乐
  • 31
  • 3
2
votes
1 answer

Load data into HBase from HDFS without using a Pig script

I have .csv files in HDFS. I want to load these into HBase tables without using a Pig script. Is there any other way available?
Avijit
  • 1,770
  • 5
  • 16
  • 34
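
For the question above, one Pig-free option is HBase's bundled ImportTsv/bulk-load tooling; another is a small client program that streams the CSV from HDFS and writes buffered Puts. A hedged sketch of the latter, assuming lines shaped like "rowkey,value1,value2", an existing table t1 with family cf, and a placeholder HDFS path:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.BufferedMutator;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    class CsvToHBase {
        public static void main(String[] args) throws IOException {
            Configuration conf = HBaseConfiguration.create();
            try (Connection conn = ConnectionFactory.createConnection(conf);
                 BufferedMutator mutator = conn.getBufferedMutator(TableName.valueOf("t1"));
                 FileSystem fs = FileSystem.get(conf);
                 BufferedReader reader = new BufferedReader(new InputStreamReader(
                         fs.open(new Path("/data/input.csv")), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    String[] parts = line.split(",");
                    Put put = new Put(Bytes.toBytes(parts[0]));
                    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("c1"), Bytes.toBytes(parts[1]));
                    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("c2"), Bytes.toBytes(parts[2]));
                    mutator.mutate(put);          // buffered client-side; flushed on close()
                }
            }
        }
    }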
2
votes
2 answers

HBase, Region Servers, Storefile Size, Indexes

Do you use compression with your index tables in HBase? If so, what type of compression do you use? I have noticed that my index tables are very big and grow each day... After adding new storage, the size is even bigger. I have…
user5688790
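
For context, enabling compression on an existing column family is a small Admin call (or the equivalent alter in the HBase shell); SNAPPY and LZ4 are the usual low-CPU choices and mainly shrink the on-disk StoreFiles. A hedged sketch against the HBase 1.x Admin API; table and family names are placeholders and the codec must be available on the region servers.

    import java.io.IOException;

    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Admin;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.io.compress.Compression;
    import org.apache.hadoop.hbase.util.Bytes;

    class EnableCompression {
        // Switch an existing family to SNAPPY; existing StoreFiles shrink after the next major compaction.
        static void snappy(Connection conn, String tableName, String family) throws IOException {
            try (Admin admin = conn.getAdmin()) {
                TableName tn = TableName.valueOf(tableName);
                HTableDescriptor htd = admin.getTableDescriptor(tn);
                HColumnDescriptor hcd = htd.getFamily(Bytes.toBytes(family));
                hcd.setCompressionType(Compression.Algorithm.SNAPPY);
                admin.modifyColumn(tn, hcd);     // HBase 1.x API; 2.x uses modifyColumnFamily
            }
        }
    }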
2
votes
1 answer

Doing a ValueFilter and counting values in the HBase shell

I am working with the HBase shell and was wondering if it is possible to count the values which the following scan command filters: scan 'table', { COLUMNS => 'cf:c', FILTER => "ValueFilter( =, 'substring:myvalue' )" } It should display the sum on the…
sara95
  • 23
  • 3
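
If the shell output is not enough, one straightforward way to get the count is to run the same filter through the Java client and count the matching cells (RowCounter is another option for whole tables). A hedged sketch mirroring the shell command above; the family/qualifier and substring are taken from the question and the Table is assumed to be obtained elsewhere.

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.filter.CompareFilter;
    import org.apache.hadoop.hbase.filter.SubstringComparator;
    import org.apache.hadoop.hbase.filter.ValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    class FilteredCount {
        // Equivalent of: scan 'table', { COLUMNS => 'cf:c', FILTER => "ValueFilter(=, 'substring:myvalue')" }
        static long count(Table table) throws IOException {
            Scan scan = new Scan();
            scan.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("c"));
            scan.setFilter(new ValueFilter(CompareFilter.CompareOp.EQUAL,
                    new SubstringComparator("myvalue")));
            long matches = 0;
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    matches += row.size();       // number of matching cells in this row
                }
            }
            return matches;
        }
    }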
2
votes
1 answer

Spark Streaming: source HBase

Is it possible to set up a Spark Streaming job to keep track of an HBase table and read new/updated rows every batch? The blog here says that HDFS files come under the supported sources, but they seem to be using the following static API…
void
  • 2,403
  • 6
  • 28
  • 53
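
There is no official HBase receiver for Spark Streaming, so the usual pattern is to poll on each batch with a scan restricted to the time range since the previous batch (HBase keeps a timestamp per cell). A hedged sketch of just the incremental-scan part; how it is wired into each micro-batch is left out and the Table is assumed to exist.

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    class IncrementalScan {
        // Return rows whose cells were written in [sinceMillis, untilMillis).
        static void readNewOrUpdated(Table table, long sinceMillis, long untilMillis) throws IOException {
            Scan scan = new Scan();
            scan.setTimeRange(sinceMillis, untilMillis);   // server-side filter on cell timestamps
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    System.out.println(Bytes.toString(row.getRow()));
                }
            }
        }
    }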
2
votes
2 answers

Is there any concept of auto commit in HBase?

I am new to HBase and want to learn more. I just want to know if there is any auto-commit concept available in HBase.
Sameer Bhand
  • 43
  • 1
  • 9
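
Short answer as I understand it: HBase has no transactions to commit; each Put/Delete is durable once the region server acknowledges it. The closest knob is the client-side write buffer: with a BufferedMutator, mutations are batched locally and only reach the server on flush/close. A minimal sketch, with table/family names as placeholders:

    import java.io.IOException;

    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.BufferedMutator;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.util.Bytes;

    class BufferedWrites {
        static void writeBatch(Connection conn) throws IOException {
            try (BufferedMutator mutator = conn.getBufferedMutator(TableName.valueOf("t1"))) {
                for (int i = 0; i < 1000; i++) {
                    Put put = new Put(Bytes.toBytes("row-" + i));
                    put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("n"), Bytes.toBytes(i));
                    mutator.mutate(put);   // buffered client-side, not yet on the server
                }
                mutator.flush();           // the closest thing to an explicit "commit"
            }                              // close() also flushes
        }
    }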
2
votes
1 answer

Does TitanDB load the full graph into memory when using g.V() from TinkerPop?

I'm using Titan now. I want to use "g.V().values()", supported by TinkerPop, in my Titan application to achieve a graph traversal. In my view, TinkerPop loads the global graph into memory when using this iterator. Titan seems to call this method directly…
Andrew Lee
  • 75
  • 6
2
votes
1 answer

Deleting cells of a row in HBase

I am new to HBase and I am creating a large table. My table is periodically scanned and some data related to some rows is deleted. I wanted to know, if I delete some columns of a specific row, whether it decreases the amount of disk…
mahdi62
  • 959
  • 2
  • 11
  • 17
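
Deleting columns of a row only writes tombstone markers at first; the space is actually reclaimed when the affected StoreFiles go through a major compaction. A hedged sketch of the delete itself, with family/qualifier names as placeholders:

    import java.io.IOException;

    import org.apache.hadoop.hbase.client.Delete;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    class DeleteColumns {
        static void dropColumns(Table table, String rowKey) throws IOException {
            Delete delete = new Delete(Bytes.toBytes(rowKey));
            // addColumns removes all versions of cf:colA; addColumn would remove only the latest version.
            delete.addColumns(Bytes.toBytes("cf"), Bytes.toBytes("colA"));
            table.delete(delete);
            // Disk usage only drops after the tombstones are applied by a major compaction.
        }
    }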
2
votes
2 answers

HBase multiple partial rowkeys scan

I am trying to find a solution to scan an HBase table with multiple partial keys from the same rowkey. Example: RowKey: account_id|name|age|transaction_date 12345|abc |50 |2016-05-05 08:10:10 Here I want to scan an HBase table to get all…
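
One way to scan several key prefixes in a single pass is a FilterList in MUST_PASS_ONE (OR) mode over PrefixFilters; for many ranges, MultiRowRangeFilter is the more scalable option. A hedged sketch of the OR-of-prefixes variant, using the account_id| key layout from the question and assuming the Table is obtained elsewhere:

    import java.io.IOException;
    import java.util.Arrays;

    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.filter.FilterList;
    import org.apache.hadoop.hbase.filter.PrefixFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    class MultiPrefixScan {
        // Return rows whose keys start with any of the given prefixes, e.g. "12345|".
        static void scanPrefixes(Table table, String... prefixes) throws IOException {
            FilterList anyPrefix = new FilterList(FilterList.Operator.MUST_PASS_ONE);
            Arrays.stream(prefixes)
                  .forEach(p -> anyPrefix.addFilter(new PrefixFilter(Bytes.toBytes(p))));
            Scan scan = new Scan();
            scan.setFilter(anyPrefix);
            try (ResultScanner scanner = table.getScanner(scan)) {
                for (Result row : scanner) {
                    System.out.println(Bytes.toString(row.getRow()));
                }
            }
        }
    }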