I have read on Wikipedia (Chunk based formats) what a chunk-based data format is, but I have a few questions about where the "header" (non-data) part of the file lives.
I can think of two approaches. In the first, a single header describes where every chunk lives. According to Wikipedia, the information needed to locate each chunk can be given by
start- and end-markers of some kind, by an explicit length field somewhere, or by fixed requirements of the file format's definition
all of which would presumably live in the header. For example:
Header
Number of entries: 2
Bytes per element: 1
Data
'H''E'
The header is followed by the data, where each element is n bytes long (the bytes-per-element value) and there are as many elements as the number of entries.
I can see the advantage here: the header gives you immediate access to any element you want. The disadvantage, however, is that each chunk is not self-contained.
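To make the first approach concrete, here is a minimal Python sketch. The field widths (two little-endian 32-bit integers) and the names `HEADER`, `pack_file` and `read_element` are my own assumptions for illustration, not part of any real format.

```python
import struct

# Hypothetical header: (number of entries, bytes per element),
# both stored as little-endian unsigned 32-bit integers.
HEADER = struct.Struct("<II")

def pack_file(elements):
    """Build a file image from a list of equally sized byte strings."""
    size = len(elements[0])
    assert all(len(e) == size for e in elements)
    return HEADER.pack(len(elements), size) + b"".join(elements)

def read_element(blob, index):
    """Fetch one element directly, using only the single header."""
    count, size = HEADER.unpack_from(blob, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    start = HEADER.size + index * size
    return blob[start:start + size]

blob = pack_file([b"H", b"E"])
print(read_element(blob, 1))  # b'E'
```

Random access is a single multiplication here, but the data bytes mean nothing without the header in front of them.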
The second approach is to have a main header which contains some, but not all, of the information, while each chunk contains its own (header, data) pair, making it self-contained.
Variable chunks based on IDs -- minimal header
Number of offsets (each is 64 bits wide)
Offset of chunk ID 1
Offset of chunk ID 2
Offset of chunk ID 3
(Note: if chunk ID 1 contains more mini-chunks then this is not noted here)
Element
chunk ID 1
Number of elements
Number of bytes per element
....Data....
Element
chunk ID 2
Number of elements
Number of bytes per element
....Data....
The number of bytes per element could also be hard-coded in the program reading the format, keyed by chunk ID, rather than stored in the file. Each chunk header would then need only the ID and the number of elements. This could also make the structure recursive or hierarchical, since each element can then have a variable size depending on what its own header says.
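As a rough sketch of the second approach, the following Python writes a main header holding only 64-bit offsets, followed by chunks that each carry their own small header. The 32-bit widths for the chunk fields and all the names used (`COUNT`, `OFFSET`, `CHUNK_HEADER`, `pack_file`, `read_chunk`) are assumptions made for illustration. This version stores the bytes per element in each chunk rather than hard-coding it in the reader.

```python
import struct

COUNT = struct.Struct("<Q")            # number of offsets
OFFSET = struct.Struct("<Q")           # one 64-bit offset per chunk
CHUNK_HEADER = struct.Struct("<III")   # (chunk ID, number of elements, bytes per element)

def pack_file(chunks):
    """chunks: list of (chunk_id, bytes_per_element, data) tuples."""
    offsets, body = [], []
    # The first chunk starts right after the offset table.
    position = COUNT.size + OFFSET.size * len(chunks)
    for chunk_id, elem_size, data in chunks:
        assert len(data) % elem_size == 0
        chunk = CHUNK_HEADER.pack(chunk_id, len(data) // elem_size, elem_size) + data
        offsets.append(position)
        body.append(chunk)
        position += len(chunk)
    table = COUNT.pack(len(chunks)) + b"".join(OFFSET.pack(o) for o in offsets)
    return table + b"".join(body)

def read_chunk(blob, index):
    """Jump to the chunk at position `index` in the offset table and decode it."""
    (count,) = COUNT.unpack_from(blob, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    (offset,) = OFFSET.unpack_from(blob, COUNT.size + index * OFFSET.size)
    chunk_id, n, elem_size = CHUNK_HEADER.unpack_from(blob, offset)
    start = offset + CHUNK_HEADER.size
    data = blob[start:start + n * elem_size]
    return chunk_id, [data[i:i + elem_size] for i in range(0, len(data), elem_size)]

blob = pack_file([(1, 1, b"HE"), (2, 2, b"LLO!")])
print(read_chunk(blob, 1))  # (2, [b'LL', b'O!'])
```

Each chunk can be decoded from its own header once its offset is known, so the main header only has to say where chunks start, not what they contain.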
What is it that makes a format chunk based?