
From my understanding, rows inserted into an HBase table are stored as regions on different region servers. So, the region servers store the data.

Similarly, in Hadoop, data is stored in the data nodes of the Hadoop cluster.

Let's say I have HBase 0.90.6 configured on top of Hadoop 1.1.1 as follows:

2 nodes - master and slave

  1. Master node acts as:
    • Hadoop: NameNode, Secondary NameNode, JobTracker, DataNode, TaskTracker
    • HBase: Master, RegionServer, and ZooKeeper
  2. Slave node acts as:
    • Hadoop: DataNode and TaskTracker
    • HBase: RegionServer

If table data is stored on the region servers, then what are the respective roles of the data nodes and the region servers?

– Manikandan Kannan

1 Answer


Data nodes store data. Region servers essentially buffer I/O operations; the data is permanently stored on HDFS (that is, on the data nodes). I do not think that putting a region server on your 'master' node is a good idea.

Here is a simplified picture of how regions are managed:

You have a cluster running HDFS (NameNode + DataNodes) with a replication factor of 3 (each HDFS block is copied to 3 different DataNodes).
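To make the replication factor concrete, here is a minimal Java sketch that asks HDFS how many replicas a file has. The HFile path below is hypothetical; substitute a file that actually exists on your cluster:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckReplication {
    public static void main(String[] args) throws Exception {
        // Picks up core-site.xml / hdfs-site.xml from the classpath.
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical path to an HFile; use a real file from your cluster.
        FileStatus status = fs.getFileStatus(new Path("/hbase/mytable/region/cf/hfile"));
        System.out.println("replicas: " + status.getReplication()); // typically 3

        fs.close();
    }
}
```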

You run RegionServers on the same servers as the DataNodes. When a write request comes to a RegionServer, it first writes the change into memory and into the commit log; then at some point it decides it is time to write the changes to permanent storage on HDFS. Here is where data locality comes into play: since the RegionServer and a DataNode run on the same server, the first HDFS block replica of the file is written to that same server. The two other replicas are written to, well, other DataNodes. As a result, the RegionServer serving the region will almost always have access to a local copy of the data.
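As an illustration, here is a minimal client-side sketch against the 0.90-era HBase Java API (the table `mytable`, column family `cf`, qualifier `q`, and value `v` are hypothetical). Note that `table.put()` returns once the RegionServer has recorded the change in its commit log (WAL) and MemStore; the flush to an HFile on HDFS happens later, on the server side:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PutExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // hypothetical table

        Put put = new Put(Bytes.toBytes("row1"));
        // One cell: column family "cf", qualifier "q", value "v".
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("v"));

        // Returns after WAL + MemStore are updated on the RegionServer;
        // the HFile on HDFS is only written later, at flush time.
        table.put(put);

        table.close();
    }
}
```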

What if a RegionServer crashes, or the HBase Master decides to reassign a region to another RegionServer (to keep the cluster balanced)? The new RegionServer will be forced to perform remote reads at first, but as soon as a compaction is performed (merging the change log into the data files), a new file is written to HDFS by the new RegionServer, and a local copy is created on its server (again, because the DataNode and the RegionServer run on the same machine).
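If you want to force that rewrite rather than wait for it, the admin API can request a major compaction. A minimal sketch, again assuming the 0.90-era API and a hypothetical table `mytable`:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class CompactExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HBaseAdmin admin = new HBaseAdmin(conf);

        // Asks the cluster to rewrite the table's store files; after a
        // region moves, this also restores data locality, since the new
        // RegionServer writes the compacted file to its local DataNode.
        admin.majorCompact("mytable"); // asynchronous request
    }
}
```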

Note: when a RegionServer crashes, the regions previously assigned to it are reassigned across multiple RegionServers.

Good reads:

  • Tom White, "Hadoop: The Definitive Guide" has a good explanation of the HDFS architecture. Unfortunately, I have not read the original Google GFS paper, so I cannot tell whether it is easy to follow.

  • The Google Bigtable paper. HBase is an implementation of Google's Bigtable, and I found the architecture description in that paper the easiest to follow.

Here are the nomenclature differences between Google's Bigtable and the HBase implementation (from Lars George, "HBase: The Definitive Guide"):

  • HBase term - Bigtable term
  • Region - Tablet
  • RegionServer - Tablet server
  • Flush - Minor compaction
  • Minor compaction - Merging compaction
  • Major compaction - Major compaction
  • Write ahead log - Commit log
  • HDFS - GFS
  • Hadoop MapReduce - MapReduce
  • MemStore - memtable
  • HFile - SSTable
  • Zookeeper - Chubby
– Yevgen Yampolskiy
  • I could even see the HBase tables created on HDFS, and it looks like chunks are stored. The link http://hbase.apache.org/book/regionserver.arch.html states that "HRegionServer is the RegionServer implementation. It is responsible for serving and managing regions. In a distributed cluster, a RegionServer runs on a Section 9.9.2, “DataNode”." But I still have difficulty understanding the role of the region server. What kind of I/O operations, and why are separate region servers required just for that I/O? – Manikandan Kannan Dec 07 '12 at 05:46
  • A region is the data in some range of rows. Say you want to get a row from an HBase table. Your request will go to the RegionServer responsible for the region containing your row. That RegionServer will either already have your row in memory (caching), or it will need to read it from HDFS (the DataNodes). If your RegionServer runs on the DataNode containing the corresponding region, this is a local filesystem read. Otherwise it is a remote read, which is slow. This is why you want to place RegionServers on DataNodes: the data locality principle. (A minimal Get sketch follows these comments.) For HDFS/DataNodes, see the Hadoop books (say, hadoopbook.com). – Yevgen Yampolskiy Dec 07 '12 at 10:28
  • Thanks a lot for the explanation, but questions still pop up: 1. How is the mapping between region servers and data nodes done? Let's say I have 3 region servers: rs1 on the same machine as dn1, rs2 on dn2, and rs3 on dn3. What dictates that rs1's region resides on dn1? My understanding is that rs1's region could go to dn2 as well; then how is locality achieved? 2. Should there be an equal number of region servers and data nodes? – Manikandan Kannan Dec 08 '12 at 14:08
  • I expanded my answer. I would run as many RegionServers as DataNodes; however, it is not strictly required. Regions can be redistributed between RegionServers, which helps keep the cluster balanced. – Yevgen Yampolskiy Dec 09 '12 at 12:55
  • What happens if a RegionServer crashes and its data is not yet written to a DataNode? Will I lose the data? If some RegionServers die, does it affect data consistency? – timurb Jun 10 '15 at 09:11
  • As per my understanding, a write goes directly to an `RS`, is then written to the `WAL` and `Memstore`, and eventually to an `HFile`. Since the `WAL` and `HFile` are on HDFS, does that mean the `RS` sends the data to the `Namenode` first, and the namenode then replicates it to the other two `RS`s? – Richeek Sep 30 '16 at 18:01
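For reference, here is the read-side counterpart to the comment above: a minimal Get sketch (0.90-era API; the table `mytable`, family `cf`, and qualifier `q` are hypothetical). Whether the underlying HFile read is local or remote is decided by HDFS, not by this client code:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class GetExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        HTable table = new HTable(conf, "mytable"); // hypothetical table

        // The request is routed to the RegionServer hosting the region
        // that contains "row1"; the server answers from its block cache
        // or MemStore, or else reads the HFile from HDFS.
        Get get = new Get(Bytes.toBytes("row1"));
        Result result = table.get(get);
        byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("q"));
        System.out.println(value == null ? "not found" : Bytes.toString(value));

        table.close();
    }
}
```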