My goal is to store hundreds of individual files as efficiently as possible and to read them using Java 1.6. The files contain 125,000 numbers on average; some contain only a few hundred numbers, others more than 7,000,000. In most cases the numbers are in the range 0 to 255 and fit into 1 byte. In some cases they are in the range 0 to 1024 and need 2 bytes.
To save the data I use Apache's BZip2 implementation. But BZip2 operates on bytes, so it cannot directly store numbers that are larger than 1 byte. That's why I wrote a class that splits a sequence of integers into bits and combines every 8 bits into 1 byte. These bytes are then written to the CBZip2OutputStream. The combination of both algorithms worked quite well. Unfortunately, my algorithm is very slow when reading. The table below shows the time in milliseconds it took to read files with 125,000 numbers.
| Gzip | BZip2 | UTF-8 | my algorithm |
| --- | --- | --- | --- |
| 47 | 28 | 35 | 1008 |
| 37 | 12 | 13 | 856 |
| 25 | 11 | 10 | 845 |
| 25 | 12 | 5 | 862 |
My algorithm is about 56 times slower than BZip2.
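For illustration, the bit-packing idea described above can be sketched roughly as follows. This is a minimal sketch and not the actual class: the name `FixedWidthBitPacker`, the fixed number of bits per value, and the most-significant-bit-first layout are all assumptions made for the example. It works on whole `int`/`long` words with shifts and masks instead of handling single bits.

```java
// Hypothetical sketch: packs ints into a byte array using a fixed bit width.
// Compatible with Java 1.6 (no newer language features are used).
final class FixedWidthBitPacker {

    /** Number of bits needed to store values in the range 0..max (at least 1). */
    static int bitsNeeded(int max) {
        return Math.max(1, 32 - Integer.numberOfLeadingZeros(max));
    }

    /** Packs every value into exactly 'bits' bits (1..32), most significant bit first. */
    static byte[] pack(int[] values, int bits) {
        long totalBits = (long) values.length * bits;
        byte[] out = new byte[(int) ((totalBits + 7) / 8)];
        long mask = (1L << bits) - 1;
        long buffer = 0;       // bits that have not been written yet (in the low end)
        int bitsInBuffer = 0;  // how many of those bits are valid
        int pos = 0;
        for (int v : values) {
            buffer = (buffer << bits) | (v & mask);
            bitsInBuffer += bits;
            // Emit complete bytes as soon as they are available.
            while (bitsInBuffer >= 8) {
                bitsInBuffer -= 8;
                out[pos++] = (byte) (buffer >>> bitsInBuffer);
            }
        }
        // Flush the remaining bits, padded with zeros on the right.
        if (bitsInBuffer > 0) {
            out[pos] = (byte) (buffer << (8 - bitsInBuffer));
        }
        return out;
    }

    /** Reverses pack(): reads 'count' values of 'bits' bits each. */
    static int[] unpack(byte[] packed, int count, int bits) {
        int[] values = new int[count];
        long mask = (1L << bits) - 1;
        long buffer = 0;
        int bitsInBuffer = 0;
        int pos = 0;
        for (int i = 0; i < count; i++) {
            // Refill one byte at a time until a full value is available.
            while (bitsInBuffer < bits) {
                buffer = (buffer << 8) | (packed[pos++] & 0xFF);
                bitsInBuffer += 8;
            }
            bitsInBuffer -= bits;
            values[i] = (int) ((buffer >>> bitsInBuffer) & mask);
        }
        return values;
    }
}
```

With this layout, values in the range 0 to 1024 would take `bitsNeeded(1024)` = 11 bits each instead of 16. The packed array could then be written to the compressing stream in a single `write(byte[])` call, and the number of values and the bit width would have to be stored alongside it so that the reader knows how to unpack.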
Is there another way to efficiently compress numbers that need more than 8 bits? Reading speed is the most important criterion: reading should take at most 2 to 4 times as long as with BZip2, at a similarly high compression ratio. If there is no other way, I will post my source code and explain it as far as necessary so that it can be optimized.