0

For context this code relates to a CS50 assignment that involves loading a bunch of words from a text-file/dictionary

#include <stdio.h>
#include <string.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

#define LENGTH 10

unsigned int hash(const char *word);
bool ghost(const char *word);
bool memoryoverlap(FILE *ptr, const char *word);


// Represents a node in a hash table
typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;


// Number of buckets in hash table
const unsigned int N = 1;

// Hash table
//node *table[N];



int main(void)
{
    FILE *dict = fopen("dictionaries/large", "r");
   
    char c;

    long bins[45] = {0};

    int wordcount = 0;

    do
    {
        char *word = malloc(sizeof(char) * (LENGTH + 1));


        if (word == NULL)
        {
            printf("Error: Null pointer for word");
            return 2;
        }


        int i = 0;

        while ((c = fgetc(dict)) != EOF)
        {
            if (c == '\n')
            {
                break;
            }

            if (isalpha(c) || (c = '\'' && i > 0))
            {
                word[i] = c;
                i++;
            }

        }

        if (c != EOF)
        {
            word[i] = '\0';
            printf("%s\n", word);
            wordcount++;

            free(word);
        }
        else
        {
            free(word);
            break;
        }
    }
    while (true);

    fclose(dict);

    printf("%i", wordcount);
}

unsigned int hash(const char *word)
{
    int length = strlen(word);

    int sum = 0;

    for (int i = 0; i < length; i++)
    {
        sum += word[i];
    }

    return sum / length;
}

I have a couple questions about this code:

  1. Originally I just created the string word by using char word[LENGTH + 1]. I went to using malloc because I found that the integer wordcount was getting overwritten and losing count as my program was running, and I figured that it may have had something to do with me running out of memory in the stack (hence shifting to using memory from the heap with malloc). I am still a little confused if my reasoning as to why wordcount was getting overwritten is correct, however, because wordcount has a different scope than the string word and I thought that variables with smaller scopes were added to the stack in an "upward" direction (meaning that they should interfere with the heap and not variables like wordcount that were in the stack to begin with). I also do not really understand why the program would have felt the need to overwrite memory in the first place, since I thought word would just be destroyed/re-created every time the loop iterates (Does the problem come from the fact that the pointer to word is random or something and I just iterate so many times it always manages to land on the location of wordcount... I don't think this is the case but I might be wrong). Is there a way to avoid this overwrite without using malloc for word?

  2. After changing from using char word[LENGTH + 1] to malloc to make word, wordcount stopped getting overwritten and started recording the correct count. However, for some reason, my program started returning the error munmap_chunk(): invalid pointer whenever it tries to execute fclose(dict) at the end of the file. I am quite confused about this. First, my understanding is that FILE *dict = fopen(...) creates a FILE pointer using memory in the stack, so I really don't get why starting to allocate memory for word using the heap should suddenly cause problems. Second, my program never gives me munmap_chunk() errors in the loop where I am continuously reading from the dict file. In fact, I don't think the program has ever crashed on me before successfully reading out the file's word. My understanding is that the munmap_chunk(): invalid pointer error comes from accidentally altering the value of the pointer, but I cannot figure out where this alteration would occur if everything appears to work okay in the loop. I believe I have also read that uninitialized pointers can cause problems by stealing the FILE pointers memory location, but I don't think (especially now that I'm using malloc) I've left any unitialized pointers anywhere. Does anyone know how I would go about getting fclose() to work?

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
pcoppi
  • 23
  • 2
  • 3
    Any word of more than 10 letters overruns into your next pointer. You don't fix that with malloc, you just overwrite a different location. – stark Jun 26 '21 at 16:13
  • 1
    Here's one problem: `if (isalpha(c) || (c = '\'' && i > 0))` You probably want `==` rather than `=` here. Also this code will fail if any word is longer than 10 characters. At the very least, you should check for buffer overflow *before* storing to `word[i]`. – Tom Karzes Jun 26 '21 at 16:43
  • You don't check that you successfully opened the file — you must **always** do that. If you don't check, `fclose()` will complain (crash) when you pass a null pointer to it. – Jonathan Leffler Jun 30 '21 at 18:19

0 Answers0