For various reasons, I am using LZMA2 to compress many blocks of data of varying size. As many blocks are processed in parallel, memory usage needs to be kept to a reasonable level. Given n bytes of data, what would be the optimal dictionary size to use? Typical source blocks range from 4 KB to 4 MB.
I speculate that there's no point in having a dictionary size larger than the number of bytes to compress. I also speculate that if the data were to compress to half its size, there would be no point in having a dictionary larger than n/2 bytes.
Of course, this is only speculation, and some insight as to why this is or is not the case would be greatly appreciated!
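To make the first speculation concrete, here is a minimal sketch of the capping idea using Python's standard `lzma` module (which exposes LZMA2 via filter chains). The helper name `clamped_dict_size` and the bounds are my own choices, not anything from a spec; the 4 KiB floor and the power-of-two rounding follow what the `lzma` docs say LZMA2 dictionary sizes should look like:

```python
import lzma

def clamped_dict_size(n: int, lo: int = 1 << 12, hi: int = 1 << 26) -> int:
    """Pick a dictionary size no larger than the input size n.

    The intuition: a dictionary larger than the data can never be
    filled, so it only wastes memory. We round up to a power of two
    (which LZMA2 implementations typically expect) and clamp to
    [lo, hi]; lo = 4 KiB is the minimum the format allows, hi is an
    arbitrary per-block memory budget chosen for this example.
    """
    size = lo
    while size < min(n, hi):
        size *= 2
    return min(max(size, lo), hi)

def compress_block(data: bytes) -> bytes:
    """Compress one block with a dictionary capped at the block size."""
    filters = [{"id": lzma.FILTER_LZMA2,
                "dict_size": clamped_dict_size(len(data))}]
    return lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)
```

For example, a 5000-byte block would get an 8 KiB dictionary rather than the 8 MiB default, and decompression works as usual with `lzma.decompress`.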
Cheers
John