How should one go about selecting the size of HFiles in an HBase setup? Most guidelines say that sizes between 8k and 1MB should be considered, but I have not found a clear way of selecting the HFile size based on the amount of data being stored.
1 Answer
"Between 8k and 1MB" refers to the block size in HBase (e.g. BLOCKSIZE => '65536'), not to the size of the region file, which is between 64MB and 3-4GB (e.g. MAX_FILESIZE => '134217728'). These two properties are set when creating the table:
create 'table1', {NAME => 'cf1', VERSIONS => 5, MAX_FILESIZE => '134217728', BLOCKSIZE => '65536' }
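You can alter them later, but existing files will not be modified to match the new properties; the new values only apply to files written afterwards (for example by later flushes or compactions).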
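As a rough sketch of changing these settings after creation, assuming the table1/cf1 names from the example above and purely illustrative values (32 KB blocks, 256 MB maximum file size), the shell commands might look like the following. Recent shell versions treat MAX_FILESIZE as a table-level attribute, so it is passed outside the column family dictionary here; older versions may also require disabling the table before altering it, so check help 'alter' for your release.

```
# Change the block size for cf1; this affects only files written from now on.
alter 'table1', {NAME => 'cf1', BLOCKSIZE => '32768'}

# Raise the maximum region file size to 256 MB (table-level attribute).
alter 'table1', MAX_FILESIZE => '268435456'

# Optional: a major compaction rewrites the existing store files,
# so they pick up the new block size.
major_compact 'table1'

# Verify the resulting table and column family settings.
describe 'table1'
```

The major compaction step is optional and can be expensive on large tables, so it may be preferable to let regular compactions rewrite the files over time.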
Have a look here for more info: http://wiki.apache.org/hadoop/Hbase/Shell

Vincent Devillers