0

Right here is the function itself. I'm having a segfault in there because apparently I'm unable to assign the string to that value in the array. clang/gcc both give me a warning. Clang's is a bit better which is "expecting char assigning char *". I don't know of any other way to work with that dictionary, as everything I've tried isn't working. I'm also going to include all helper functions for it, but I'm pretty sure it's in this function itself.

As per usual, I'll upvote any answer that works, and I'll accept the one that I personally chose. Anyway I'm going to post the rest of the "helper" functions below.

void lzw_compress(char *uncompressed_data,size_t uncompressed_length,char *out){
unsigned long i=0,j=0;
char *character=malloc(1);
char *word=malloc(65535);
char *word_character=malloc(65535);
unsigned long word_size=0;
long *tmp_buffer=malloc(65535);
char *dictionary=malloc(130000);
for(i=0;i<=255;++i){
    dictionary[i]=base_dictionary[i];
}
long index=0;  
unsigned long dictionary_size=256;
for(i=0;i<uncompressed_length;++i){
    character[0]=(unsigned char )uncompressed_data[i];
    //arrcat(word_character,word,word_size,character);
    for(j=0;j<word_size;++j){
        word_character[j]=word[j];
    }
    word_character[j]=*character;
    index=search(dictionary,dictionary_size,word_character);
    if(index!=-1){
        for(j=0;j<(word_size+1);++j){
            word[j]=word_character[j];
        }
        ++word_size;
    }
    else{
        tmp_buffer[j++]=index;
        ++dictionary_size;
        //dictionary[dictionary_size++]=(unsigned long *)word_character;

            dictionary[dictionary_size]=*word_character;

        word=*character;
        word_size=1;
    }
}
if(memcmp(word,"",1)!=0){
    tmp_buffer[j++]=search(dictionary,dictionary_size,word);
}
    char *debug="";
    for(i=0;i<j;++i){
        sprintf(debug,"%s%lu,",debug,tmp_buffer[i]);
    }
    printf("%s",debug);

}

long search(char *table,unsigned long table_length,char
*search_value){
    unsigned long i=0;
    for(i=0;i<table_length;++i){
        if(table[i]==*search_value){
            return i;
        }
    }
    return -1; 
 }

So as you can see I'm trying to do an lzw-like program in pure c. I always compile with -Wall -std=c99(because I occasionally use p99.h for preprocessor macro abuse). But for some reason I'm unable to get my array of strings to work, I know I've used code similar to it(but apparently I didn't back it up...) but anyway yeah. I'm unable to figure out how I'm supposed to be doing it(properly). I'll greatly appreciate anyone's help with this issue.

As per normal whatever code I post on here, is public domain unless otherwise stated, and once I get the entire thing working I post it here so that anyone else looking for it can get it working too.

Finally thanks for reading this thread, and for heping me(if you know how to). Once I get back after going to town(if there's already an answer), I'll check it/mark off things then. But don't let that discourage you, because yours may be a better solution than the one that I chose and you'll still get an upvote.

Edit 1:Edited the code to what it was previously(according to git).

Edit 2: Fixed up a lot of things, got it looking better. Still the array comparison function isn't working(for some odd reason).

133794m3r
  • 5,028
  • 3
  • 24
  • 37

1 Answers1

1

Now that you have the allocations, there are a few points that can be determined as errors from that:

for(i=0;i<uncompressed_length;++i){
    character[0]=(unsigned char )uncompressed_data[i];
    //arrcat(word_character,word,word_size,character);
    for(j=0;j<word_size;++j){
        word_character[j]=word[j];
    }

Initially, the memory word points to is uninitialised, and word_size is 1. So you copy the indeterminate char word[0] to word_character[0]. I'm not sure whether you should set word_size = 0 initially, or move that copying loop, or something else.

word_character[j]=character;

You're assigning a char* to a char. You probably meant word_character[j] = *character; there (or character[0] instead of *character, which is equivalent).

  dictionary[dictionary_size]=word_character;

Again assigning a char* to a char. I can't guess what you wanted here, since dictionary_size is not changed in the loop. Maybe you wanted to increment dictionary_size and copy the word_character string?

    word=character;
    word_size=1;

Here you're losing the handle to the memory that was allocated to word initially - commonly known as a memory leak - and let word point to a memory block that has enough space for one character. You probably meant to copy the pointed-to character,

word[0] = character[0];

there?


Initial answer for original code:

void lzw_compress(char *uncompressed_data,size_t uncompressed_length,char *out){
unsigned long i=0,j=0;
char *character;
char *word;
char *word_character;
unsigned long word_size=1;
long *tmp_buffer=malloc(65535);
char *dictionary;
for(i=0;i<=255;++i){
    dictionary[i]=base_dictionary[i];
}

You haven't allocated any memory for dictionary to point to, this is undefined behaviour with a non-zero segfault probability.

long index=0;  
unsigned long dictionary_size=256;
for(i=0;i<uncompressed_length;++i){
    character[0]=(unsigned char )uncompressed_data[i];

You haven't allocated memory for character either, again undefined behaviour.

    //arrcat(word_character,word,word_size,character);
    for(j=0;j<word_size;++j){
        word_character[j]=word[j];
    }

word_character and word don't point to allocated memory either, more undefined behaviour.

    word[j]=(unsigned long)character;

You're casting a char* to unsigned long and assign that value to a (non-allocated) char. Even if word[j] was valid memory, what's the intention here?

    index=search(dictionary,dictionary_size,word_character);
    if(index!=-1){
        for(j=0;j<(word_size+1);++j){
            word[j]=word_character[j];
        }
        ++word_size;
    }
    else{
        tmp_buffer[j++]=index;
        ++dictionary_size;
        //dictionary[dictionary_size++]=(unsigned long *)word_character;
      for(j=0;j<word_size;++j){
            dictionary[dictionary_size]=word_character;
       }
        word=character;
        word_size=1;
    }
}
if(memcmp(word,"",sizeof word)!=0){

sizeof word is the size of a char*. You probably intended to use the length of the string here.

    tmp_buffer[j++]=search(dictionary,dictionary_size,word);
}
    char *debug="";
    for(i=0;i<j;++i){
        sprintf(debug,"%s%lu,",debug,tmp_buffer[i]);

calling sprintf with overlapping source and destination is undefined behaviour. In this case, it is even a string literal. String literals aren't modifiable, so that's another source for undefined behaviour, and a likely crash due to trying to modify a string literal.

    }
    printf("%s",debug);

}
Daniel Fischer
  • 181,706
  • 17
  • 308
  • 431
  • I've forgotten the unsigned long bit in the code from before. I've allocated 130k of memory to dictionary, 1byte to character(as it's 1 byte only), did 65k for word, and another 65k for word_character, and I still have that segfault. I'll try to edit the above to what I was working with before. – 133794m3r Feb 18 '13 at 16:46
  • Also I was doing that because I was trying to find a way of putting the long unsigned value that is held in tmp_buffer into the debug string, that was my original intention with it. I know it was foolish but I've not touched any code in well over 7 months and I'm trying to get back into things again. Also, I've edited the above to what it "should" have been before. I'll change the sprintf into a printf, and just make it a huge long loop and see if that makes it no segfault. – 133794m3r Feb 18 '13 at 16:53
  • It's working now at least not segfaulting now I have to go through my other code and figure out why it's not working. So everything is "sorta" working. And since I only asked about the segfaulting I marked yours as the answer and gave you an upvote. – 133794m3r Feb 18 '13 at 17:04
  • 1
    Nice, thanks, but I intend to update the answer with a few more points that follow from your updated code. I'll have to leave very soon, though, so I might only finish when I return in a bit over an hour. – Daniel Fischer Feb 18 '13 at 17:06
  • 1
    @133794m3r Managed to finish what jumped to my eyes, hope those points help. – Daniel Fischer Feb 18 '13 at 17:17
  • I have a lot of errors in that code above... I'm going to go ahead and edit it... I'm just wow. I see so many things now that someone else is showing me errors... I'm going to ahead and edit it right now and see if ti fixes it. – 133794m3r Feb 18 '13 at 17:25
  • Would've edited the previous one but apparently it's the array comparison function that isn't working. If you're up to it, I'd be very grateful if you could fix that bit. (As that's all that's wrong with it still...). Essentially what I'm trying to do is write my own file compressor(because I really like compression algorithms). – 133794m3r Feb 18 '13 at 17:34