How should one go about selecting the size of HFiles in an HBase setup? Most guidelines say that sizes between 8 KB and 1 MB should be considered, but I have not found a clear way to choose the HFile size based on the amount of data being stored.

Cornelius

1 Answer

The "between 8 KB and 1 MB" range refers to the block size in HBase (e.g. BLOCKSIZE => '65536'), not the size of a region's files, which is typically between 64 MB and 3-4 GB (e.g. MAX_FILESIZE => '134217728'). Both properties can be set when creating the table:

create 'table1', {NAME => 'cf1', VERSIONS => 5, BLOCKSIZE => '65536'}, MAX_FILESIZE => '134217728'

You can alter them later, but existing files are not rewritten with the new properties; the new values only apply to files written afterwards (for example, by flushes and compactions).
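As a rough sketch of what altering them later could look like, assuming a reasonably recent HBase shell and reusing the table1 / cf1 names from the example above (the new values, 128 KB and 256 MB, are purely illustrative and not a recommendation; older releases may require disabling the table before altering it):

```
# Assumption: recent HBase shell syntax; values below are illustrative only.
# Change the block size of column family 'cf1' to 128 KB.
alter 'table1', {NAME => 'cf1', BLOCKSIZE => '131072'}

# Change the maximum region file size of the table to 256 MB.
alter 'table1', MAX_FILESIZE => '268435456'

# Optionally force a major compaction so existing files are rewritten.
major_compact 'table1'

# Check the resulting table and column family settings.
describe 'table1'
```

The major compaction is optional; without it, old files simply keep their previous block size until they are compacted in the normal course of operation.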

Have a look here for more info: http://wiki.apache.org/hadoop/Hbase/Shell

Vincent Devillers