12

What is the difference between a store file and an HFile?

I have a basic idea of compaction: store files are merged together to reduce the number of disk seeks.

Is that correct? Can someone explain compaction in more detail, i.e. the exact process and how it works?

Srinu Katta

2 Answers

19

Store file and HFile are synonyms, used interchangeably to refer to the same concept.

When something is written to HBase, it is first written to an in-memory store (the memstore). Once the memstore reaches a certain size, it is flushed to disk as a store file. (Every write is also recorded immediately in a log file for durability.) The store files (or HFiles) created on disk are immutable. From time to time the store files are merged together by a process called compaction.
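To make the write path concrete, here is a minimal sketch in Python. It is a toy model, not HBase code: the class `ToyStore`, its methods, and the entry-count threshold are all made up for illustration (HBase flushes based on memstore size in bytes, and real store files live on HDFS).

```python
class ToyStore:
    """Toy model of one HBase Store. Hypothetical names, not the HBase API."""

    def __init__(self, flush_threshold=3):
        # Assumption: we count entries; HBase uses a size in bytes.
        self.flush_threshold = flush_threshold
        self.wal = []          # append-only log, written first for durability
        self.memstore = {}     # in-memory buffer of recent writes
        self.store_files = []  # immutable sorted files, oldest first

    def put(self, key, value):
        self.wal.append((key, value))  # log the write before buffering it
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.memstore:
            return
        # A tuple of sorted pairs stands in for an immutable store file.
        self.store_files.append(tuple(sorted(self.memstore.items())))
        self.memstore = {}

    def get(self, key):
        # Newest data wins: check the memstore, then files newest to oldest.
        if key in self.memstore:
            return self.memstore[key]
        for store_file in reversed(self.store_files):
            for k, v in store_file:
                if k == key:
                    return v
        return None
```

Note how `get` may have to look through every store file. That is why the number of store files matters for read performance, and why they get merged.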

For more information with statistics, see here. Happy Learning

Mehdi LAMRANI
Ramzy
  • Thank you @Ramzy . Is store file and HFile same? – Srinu Katta Jul 24 '15 at 02:24
  • 3
    yes. [This](http://hbase.apache.org/0.94/book/regions.arch.html) and [this](http://www.slideshare.net/enissoz/hbase-and-hdfs-understanding-filesystem-usage) will give you a high level picture. – Ramzy Jul 24 '15 at 02:51
3

When the MemStore reaches a given size (`hbase.hregion.memstore.flush.size`), it flushes its contents to a StoreFile. The number of StoreFiles in a Store increases over time. Compaction is an operation which reduces the number of StoreFiles in a Store, by merging them together, in order to increase performance on read operations. Compactions can be resource-intensive to perform, and can either help or hinder performance depending on many factors.
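As a rough illustration of the merge step, the sketch below treats each StoreFile as a sorted tuple of `(key, value)` pairs and combines several of them into one. The function name `compact` and this representation are assumptions made for the example. A real compaction streams through the files and also deals with cell versions, timestamps and delete markers, none of which is modelled here.

```python
def compact(store_files):
    """Merge several sorted, immutable store files into a single one.

    store_files is ordered oldest first; each file is a tuple of
    (key, value) pairs. Hypothetical helper, not part of HBase.
    """
    merged = {}
    for store_file in store_files:  # oldest first, so newer values overwrite
        merged.update(store_file)
    # The result is again sorted and immutable; the inputs are left untouched.
    return tuple(sorted(merged.items()))


older = (("a", 1), ("c", 3))
newer = (("a", 9), ("b", 2))
single_file = compact([older, newer])
```

After the merge, a read has one file to search instead of two, which is the read-side gain the paragraph above describes. The cost is that every input file has to be read and rewritten.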

Compactions fall into two categories: minor and major.