
I have an LZW compressor/decompressor written in C.

The initial table consists of the ASCII characters, and each new string saved into the table consists of a prefix and a character, both stored in a list as int.

My compression works but my decompression leaves some characters out.

The input:

<title>Agile</title><body><h1>Agile</h1></body></html>

The output I get (notice the missing 'e' and '<'):

<title>Agile</title><body><h1>Agil</h1></body>/html>

This is the code I use (the relevant part):

void expand(int * input, int inputSize) {    
    // int prevcode, currcode
    int previousCode; int currentCode;
    int nextCode = 256; // start with the same dictionary of 256 single-character codes (0-255)
    dictionaryInit();

    // prevcode = read in a code
    previousCode = input[0];

    int pointer = 1;

    // while (there is still data to read)
    while (pointer < inputSize) {
        // currcode = read in a code
        currentCode = input[pointer++];

        if (currentCode >= nextCode) printf("!"); // XXX not yet implemented!
        currentCode = decode(currentCode);

        // add a new code to the string table
        dictionaryAdd(previousCode, currentCode, nextCode++);

        // prevcode = currcode
        previousCode = currentCode;
    }
}

int decode(int code) {
    int character; int temp;

    if (code > 255) { // decode
        character = dictionaryCharacter(code);
        temp = decode(dictionaryPrefix(code)); // recursion
    } else {
        character = code; // ASCII
        temp = code;
    }
    appendCharacter(character); // save to output
    return temp;
}

Can you spot it? I'd be grateful.

mjv
Radek
  • 1
    Note that you should try to avoid relying on your compression until you can decompress it. In other words, if your statement that "my compression works" actually means "it reduces your size", and that's it, you shouldn't rule out a bug in that code just yet. – Lasse V. Karlsen Dec 02 '09 at 15:03
  • 3
    My compression works as using someone else's decompression on my input works. – Radek Dec 02 '09 at 15:04
  • 1
    The 8th line -> previousCode = input[0]; seems suspicious to me. You're calling appendCharacter() for output in decode(), and yet this first code will never be presented to appendCharacter() for output. Also, if inputSize is zero, input[0] may be a bad dereference. – meklarian Dec 02 '09 at 15:19
  • 1
    Is there a reason why you can't step through the code with a debugger to see why those characters are skipped? – Wim Coenen Dec 02 '09 at 15:19

1 Answer


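Your decode function returns the first character of the decoded string. You need that character to build the new dictionary entry, but you should not assign it to previousCode: previousCode has to stay the full code you just read, otherwise the next entry gets only the first character of the previous string as its prefix instead of the whole string. So your code should look like: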

...
firstChar = decode(currentCode);
dictionaryAdd(previousCode, firstChar, nextCode++);
previousCode = currentCode;
...
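
For reference, here is a minimal self-contained sketch of the whole expansion loop with that fix applied. It is not the asker's actual code: the prefix/suffix arrays, MAX_CODES, the output buffer and the name lzw_expand are all assumptions made for illustration, standing in for dictionaryAdd, dictionaryPrefix, dictionaryCharacter and appendCharacter. It also outputs the first code (as meklarian's comment points out) and handles the "not yet implemented" case where currentCode equals nextCode, under the usual LZW assumption that such a code means "previous string plus its own first character". Stopping table growth once the table is full is likewise an assumption and would have to match whatever the compressor does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_CODES 4096  /* assumed table size (12-bit codes) */

/* For code c >= 256: string(c) = string(prefix[c]) followed by suffix[c]. */
static int prefix[MAX_CODES];
static int suffix[MAX_CODES];

/* Appends the string for `code` to out; returns the string's first character. */
static int decode(int code, char *out, size_t cap, size_t *len) {
    int first = code;                      /* single characters decode to themselves */
    int last = code;
    if (code > 255) {
        first = decode(prefix[code], out, cap, len);  /* prefix string comes first */
        last = suffix[code];
    }
    if (*len + 1 < cap) out[(*len)++] = (char)last;   /* leave room for the NUL */
    return first;
}

/* Expands n codes into out (NUL-terminated) and returns the number of
   characters written. Stops early on a code that cannot be valid. */
size_t lzw_expand(const int *input, size_t n, char *out, size_t cap) {
    size_t len = 0;
    int nextCode = 256;                    /* codes 0..255 are the single characters */
    assert(out != NULL && cap > 0);
    memset(prefix, 0, sizeof prefix);
    memset(suffix, 0, sizeof suffix);
    if (n == 0 || input[0] < 0 || input[0] > 255) { out[0] = '\0'; return 0; }

    int previousCode = input[0];
    int firstChar = decode(previousCode, out, cap, &len);  /* emit the first code too */

    for (size_t i = 1; i < n; i++) {
        int currentCode = input[i];
        if (currentCode < 0 || currentCode > nextCode || currentCode >= MAX_CODES) break;

        if (currentCode == nextCode) {
            /* Code not in the table yet: it must be the previous string plus
               that string's own first character, so define it before decoding. */
            prefix[nextCode] = previousCode;
            suffix[nextCode] = firstChar;
            firstChar = decode(currentCode, out, cap, &len);
            nextCode++;
        } else {
            firstChar = decode(currentCode, out, cap, &len);
            if (nextCode < MAX_CODES) {    /* table full: stop adding entries (assumption) */
                prefix[nextCode] = previousCode;
                suffix[nextCode] = firstChar;
                nextCode++;
            }
        }
        previousCode = currentCode;        /* keep the code, not its first character */
    }
    out[len] = '\0';
    return len;
}
```

The last line of the loop is the important one: previousCode keeps the code that was read, while firstChar is only used to build the new table entry.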
interjay