I'm trying to read a stream of data from standard input, compress it one 128-byte block at a time, and write the result to standard output. (Example: "cat file.txt | java Dict | gzip -d | cmp file.txt", where file.txt just contains some ASCII characters.)
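For reference, here is a minimal sketch of the no-dictionary baseline, checked in-process with GZIPInputStream instead of the gzip command. The class name BlockGzipSketch and its helper methods are made up for illustration, and passing syncFlush = true to the constructor is my assumption (Dict.java uses the one-argument constructor).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of Dict.java.
public class BlockGzipSketch {
    static final int BLOCK_SIZE = 128;

    // Compress data in BLOCK_SIZE chunks, flushing after every chunk.
    public static byte[] compressInBlocks(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // Assumption: syncFlush = true, so flush() also flushes the deflater.
        try (GZIPOutputStream gz = new GZIPOutputStream(sink, true)) {
            for (int off = 0; off < data.length; off += BLOCK_SIZE) {
                int len = Math.min(BLOCK_SIZE, data.length - off);
                gz.write(data, off, len);
                gz.flush();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The stream is closed here, so the gzip trailer has been written.
        return sink.toByteArray();
    }

    // Decompress a complete gzip stream held in memory.
    public static byte[] decompress(byte[] gzipped) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) > 0) {
                out.write(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] data = new byte[1000];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) ('a' + i % 7);
        }
        byte[] restored = decompress(compressInBlocks(data));
        System.out.println("round trip ok: " + Arrays.equals(data, restored));
    }
}
```

This only exercises the case that already works for me; it does not touch the dictionary at all.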
I also need to use a 32-byte dictionary for each block, taken from the end of the previous 128-byte block. (The first block uses its own first 32 bytes as its dictionary.) When I don't set the dictionary at all, the compression works fine. However, when I do set the dictionary, gzip fails to decompress the data with this error: "gzip: stdin: invalid compressed data--crc error".
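To illustrate what a preset dictionary involves on both sides, here is a minimal sketch using Deflater and Inflater directly with the zlib wrapper (RFC 1950), where the header can flag that a dictionary is required. As far as I can tell the gzip format (RFC 1952) has no such field, so this is not a drop-in fix for the gzip pipeline. The class PresetDictionarySketch and its method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper class, not part of Dict.java.
public class PresetDictionarySketch {
    static final int DICT_SIZE = 32;

    // The last DICT_SIZE bytes of a block (the block must be at least that long).
    public static byte[] tailDictionary(byte[] block) {
        return Arrays.copyOfRange(block, block.length - DICT_SIZE, block.length);
    }

    // Compress one block in zlib format using a preset dictionary.
    public static byte[] deflateWithDictionary(byte[] data, byte[] dict) {
        Deflater def = new Deflater();   // default constructor: zlib wrapper
        def.setDictionary(dict);         // must happen before the first deflate()
        def.setInput(data);
        def.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!def.finished()) {
            int n = def.deflate(buf);
            out.write(buf, 0, n);
        }
        def.end();
        return out.toByteArray();
    }

    // Decompress, supplying the dictionary when the inflater asks for it.
    public static byte[] inflateWithDictionary(byte[] compressed, byte[] dict) {
        Inflater inf = new Inflater();
        inf.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        try {
            while (!inf.finished()) {
                int n = inf.inflate(buf);
                if (n > 0) {
                    out.write(buf, 0, n);
                } else if (inf.needsDictionary()) {
                    inf.setDictionary(dict);   // the reader must know the same bytes
                } else if (inf.needsInput()) {
                    throw new IllegalStateException("compressed data is truncated");
                }
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException(e);
        } finally {
            inf.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] previous = "the quick brown fox jumps over the lazy dog, again and again"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] current = "over the lazy dog, again and again, jumps the quick brown fox"
                .getBytes(StandardCharsets.US_ASCII);
        byte[] dict = tailDictionary(previous);
        byte[] packed = deflateWithDictionary(current, dict);
        byte[] restored = inflateWithDictionary(packed, dict);
        System.out.println("round trip ok: " + Arrays.equals(current, restored));
    }
}
```

The point of the sketch is that the decompressing side has to be handed the same dictionary bytes, which "gzip -d" on the command line has no way of receiving.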
I've tried adding/changing several parts of the code, but nothing has worked so far, and I haven't had any luck finding solutions with Google.
Here is what I've tried so far, none of which worked:
- Adding "def.reset()" before "def.setDictionary(b)" near the bottom of the code.
- Setting the dictionary only for blocks after the first one (that is, using no dictionary for the first block).
- Calling updateCRC with the "input" array, either before or after compressor.write(input, 0, bytesRead).
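On the CRC attempt: as far as I understand it, the CRC-32 in the gzip trailer is computed over the uncompressed bytes, and GZIPOutputStream.write already updates the stream's crc field, so an extra updateCRC call would count the same block twice. A minimal sketch of how the checksum accumulates block by block (CrcSketch and its methods are hypothetical names):

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical helper class, not part of Dict.java.
public class CrcSketch {
    // CRC-32 of the whole array in a single update.
    public static long crcWhole(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    // CRC-32 accumulated one block at a time, as a streaming writer would do.
    public static long crcInBlocks(byte[] data, int blockSize) {
        CRC32 crc = new CRC32();
        for (int off = 0; off < data.length; off += blockSize) {
            int len = Math.min(blockSize, data.length - off);
            crc.update(data, off, len);
        }
        return crc.getValue();
    }

    public static void main(String[] args) {
        byte[] data = "some ASCII characters, repeated: some ASCII characters"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println("whole:     " + Long.toHexString(crcWhole(data)));
        System.out.println("in blocks: " + Long.toHexString(crcInBlocks(data, 16)));
    }
}
```

Both lines should print the same value, since the checksum depends only on the sequence of uncompressed bytes and not on how they are split into blocks.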
I'd really appreciate any suggestions - is there anything obvious I'm missing or doing wrong?
This is what I have in my Dict.java file:
import java.io.*;
import java.util.zip.GZIPOutputStream;

public class Dict {
    protected static final int BLOCK_SIZE = 128;
    protected static final int DICT_SIZE = 32;

    public static void main(String[] args) {
        InputStream stdinBytes = System.in;
        byte[] input = new byte[BLOCK_SIZE];
        byte[] dict = new byte[DICT_SIZE];
        try {
            DictGZIPOutputStream compressor = new DictGZIPOutputStream(System.out);

            // The first block uses its own first DICT_SIZE bytes as its dictionary.
            int bytesRead = stdinBytes.read(input, 0, BLOCK_SIZE);
            if (bytesRead >= DICT_SIZE) {
                System.arraycopy(input, 0, dict, 0, DICT_SIZE);
                compressor.setDictionary(dict);
            }

            // A while loop (rather than do-while) so that empty input never
            // reaches write() with bytesRead == -1.
            while (bytesRead > 0) {
                compressor.write(input, 0, bytesRead);
                compressor.flush();

                // The last DICT_SIZE bytes of a full block become the
                // dictionary for the next block.
                if (bytesRead == BLOCK_SIZE) {
                    System.arraycopy(input, BLOCK_SIZE - DICT_SIZE, dict, 0, DICT_SIZE);
                    // Setting the dictionary is what leads to the gzip error described above.
                    compressor.setDictionary(dict);
                }
                bytesRead = stdinBytes.read(input, 0, BLOCK_SIZE);
            }
            compressor.finish();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Exposes the protected deflater ("def") and checksum ("crc") of GZIPOutputStream.
    public static class DictGZIPOutputStream extends GZIPOutputStream {
        public DictGZIPOutputStream(OutputStream out) throws IOException {
            super(out);
        }

        public void setDictionary(byte[] b) {
            def.setDictionary(b);
        }

        public void updateCRC(byte[] input) {
            crc.update(input);
        }
    }
}