I'm implementing a trie data structure in C for words in a large dictionary. The dictionary contains strings in the following format:

abacus
babble
cabal
....

I define and allocate memory for the trie outside of the load function. load reads each word from the dictionary and inserts each character into a position of the array children[27] inside each node. Indices 0 to 25 are for the characters a to z, and the apostrophe character ' is at index 26.
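The index mapping described above can be sketched as a small helper. The name char_to_index and the -1 return value for unsupported characters are assumptions made for illustration, and the arithmetic assumes a character set such as ASCII where a to z are contiguous.

```c
#include <ctype.h>

#define ALPHABET_SIZE 27   /* 26 letters plus the apostrophe */

/* Hypothetical helper: returns the child index for c,
   or -1 if c has no slot in children[27]. */
int char_to_index(char c)
{
    /* tolower() needs a value representable as unsigned char */
    int lower = tolower((unsigned char) c);

    if (lower >= 'a' && lower <= 'z')
        return lower - 'a';          /* 'a' -> 0 ... 'z' -> 25 */
    if (c == '\'')
        return ALPHABET_SIZE - 1;    /* apostrophe -> 26 */
    return -1;                       /* digit, punctuation, etc. */
}
```

Returning -1 lets the caller skip or reject a character instead of indexing children with an unset value.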

The problem is that I don't know whether I should create and allocate memory for the top level of the trie inside or outside the calling function. I will free the memory in another function, unload, after I have finished using the trie. I cannot modify the main function, so I'm not sure whether there will be a memory leak when I'm finished.
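One possible arrangement, sketched under assumptions: keep the root pointer at file scope but start it at NULL, allocate the top-level node inside a function, and release everything in unload. The helper names trie_init and free_node, and the signature bool unload(void), are hypothetical and not taken from the assignment.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct node
{
    bool is_word;
    struct node *children[27];
} node;

/* A file-scope object may only be initialized with a constant,
   so the pointer starts out as NULL. */
static node *root = NULL;

/* Hypothetical helper: allocate the top-level node on first use.
   load() could call this before reading any words. */
bool trie_init(void)
{
    if (root == NULL)
        root = calloc(1, sizeof(node));
    return root != NULL;
}

/* Recursively free a node and every node below it. */
static void free_node(node *n)
{
    if (n == NULL)
        return;
    for (int i = 0; i < 27; i++)
        free_node(n->children[i]);
    free(n);
}

/* Release the whole trie; harmless if nothing was ever loaded. */
bool unload(void)
{
    free_node(root);
    root = NULL;
    return true;
}
```

With this layout main does not need to change: as long as it calls unload once after the last lookup, every node allocated during loading is freed.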

Here is the code:

#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// trie data structure for 26 letters and apostrophe
typedef struct node
{
    bool is_word;
    struct node *children[27];
} node;

// allocate memory for trie top level and set them to 0
node *root = calloc(1, sizeof(node));

// loads dictionary into memory, return true if successful
bool load(const char *dictionary)
{
    // open input file
    FILE *fp = fopen(dictionary, "r");
    if (fp == NULL)
        return false;

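    int p;          // trie index position
    char c;         // current char
    char word[46];  // current word; assumes no word is longer than 45 characters

    // scan dictionary word by word; the width keeps fscanf inside the buffer
    while (fscanf(fp, "%45s", word) == 1)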
    {
        // set head pointer to point to root
        node *head = root;

        // loop through current word until the terminating null byte
        for (int i = 0; word[i] != '\0'; i++)
        {
            c = word[i];

            // letter or apostrophe to child index; skip anything else
            if (isalpha((unsigned char) c))
                p = tolower((unsigned char) c) - 'a';
            else if (c == '\'')
                p = 26;
            else
                continue;

            // allocate a child node if this branch does not exist yet
            if (head->children[p] == NULL)
            {
                head->children[p] = calloc(1, sizeof(node));
                if (head->children[p] == NULL)
                {
                    fclose(fp);
                    return false;
                }
            }
            // step down to the child node
            head = head->children[p];
        }
        // complete word, set flag to true
        head->is_word = true;
    }
    // finished
    fclose(fp);
    return true;
}
Jonathan Leffler
Kingle
  • `node *root = calloc(1, sizeof(node));` shouldn't even compile. – melpomene Sep 17 '18 at 05:12
  • You can allocate the space for each node inside the `load()` function, or in a function that it calls if you prefer. You can't initialize `node *root` with the result of a function call when you define `root` outside any function (unless you cheat and use a C++ compiler, but then you shouldn't be using `calloc()` at all). – Jonathan Leffler Sep 17 '18 at 05:50
  • Any or all of the following questions might help: https://stackoverflow.com/q/13674617, https://stackoverflow.com/q/37106523, https://stackoverflow.com/q/39949116, https://stackoverflow.com/q/41283310, https://stackoverflow.com/q/42522720; there are likely others too. – Jonathan Leffler Sep 17 '18 at 05:55
  • @JonathanLeffler Thank you for the answer and links. I'm learning programming from CS50 on edX so I'm unfamiliar with how memory allocation works. This is only part of the problem I'm working on so I couldn't compile the code yet. – Kingle Sep 17 '18 at 13:51
  • It can make life easier if you build your program in phases, and phase 1 might well be to load the dictionary. You can then write code that dumps the content of the loaded dictionary, which may well be helpful in future anyway. You can then compile and test phase 1, before progressing onwards. It is easier to debug small parts, and make sure they're working, than to try to get everything done and then have to work out what's broken. – Jonathan Leffler Sep 17 '18 at 17:34
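The dump step suggested in the last comment could look roughly like the sketch below. The function names dump and dump_node and the MAX_WORD limit of 45 are assumptions for illustration; the node layout is the one from the question.

```c
#include <stdbool.h>
#include <stdio.h>

typedef struct node
{
    bool is_word;
    struct node *children[27];
} node;

#define MAX_WORD 45   /* assumed upper bound on word length */

/* Depth-first walk: prefix holds the characters on the path from
   the root to n. Returns the number of words printed. */
static int dump_node(const node *n, char *prefix, int depth, FILE *out)
{
    int count = 0;

    if (n == NULL)
        return 0;
    if (n->is_word)
    {
        prefix[depth] = '\0';
        fprintf(out, "%s\n", prefix);
        count++;
    }
    if (depth >= MAX_WORD)
        return count;   /* do not write past the prefix buffer */

    for (int i = 0; i < 27; i++)
    {
        if (n->children[i] != NULL)
        {
            /* index 26 is the apostrophe, 0 to 25 are a to z */
            prefix[depth] = (i == 26) ? '\'' : (char) ('a' + i);
            count += dump_node(n->children[i], prefix, depth + 1, out);
        }
    }
    return count;
}

/* Print every stored word, one per line, in index order. */
int dump(const node *root, FILE *out)
{
    char prefix[MAX_WORD + 1];
    return dump_node(root, prefix, 0, out);
}
```

Calling dump(root, stdout) after load and comparing the output with the dictionary file is one way to check phase 1 before writing the lookup code.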

0 Answers