HBase articles and books say the following about the Meta and FileInfo blocks in HFiles:
"The Meta block is designed to keep a large amount of data with its key as a String, while FileInfo is a simple Map preferred for small information with keys and values that are both byte-array." OR "Metadata blocks are expensive. Fill one with a bunch of serialized data rather than do a metadata block per metadata instance. If metadata is small, consider adding to file info".
I want to understand why they say that. What is the design rationale for keeping large data in Meta blocks and small data in FileInfo?
The reason I ask is that our project stores some information in FileInfo. Over time that information has grown, and we now have up to 15-20 MB of data in FileInfo. From the text above, it seems we should not be doing this, but we don't know what impact, if any, it has on our system.
Can someone please shed some light on this? I've looked at the HFile and FileInfo code and couldn't find any obvious reason.