0

So I'm implementing a trie to store words in a dictionary file. I've implemented the insert operation; now I'm trying to print lexicograpically. I'm close to getting it, but I have a small problem that I'm not sure how to fix. I'm also trying to keep in mind the speed of my program, which is why I've opted for a trie over an array or a linked list. Here is what a single node looks like:

struct node {
  int end;
  int occurrences;
  int superwords;
  struct node* child[26];
};

"end" indicates the completion of a word (for example, end == 1 at letter 'k' in the word book; this prevents confusion in checking whether or not a word has actually been inserted in the tree).

Here's the method:

void preorder(struct node *follow, char hold[200], int s){
  int i = 0;
  if(follow == NULL){
    return;
  }

  for(i = 0; i < 26; i++){
    if(follow->child[i] == NULL){
      continue;
    }
    else{
      printf("%c",'a'+i);
      hold[s] = 'a'+i;
      s++;
      if(follow->child[i]->end == 1){
        printf("\n");
        hold[s] = '\0';
        printf("%s", hold);
      }
      preorder(follow->child[i], hold, s);
    }
  }
  return;
}

The words I've inserted are: boo, book, booking, john, tex, text. They should be printed in that order and line separated. My output looks like:

boo
book
booking
bookingjohn
bjohntex
bjtext
bjtext

I know this probably has something to do with my "hold" array, which stores the prefixes of words so they don't get lost. I need to set the index back to zero somewhere to indicate the completion of a prefix and all its associated words (boo, book, booking being a good example) but haven't been successful. Any help would be much appreciated and I would be glad to further clarify my thought process.

rici
  • 234,347
  • 28
  • 237
  • 341
Sara
  • 121
  • 7
  • how familiar are you with the concept of static variables? since this function is recursive, it might be good to use a static variable or two to keep track of information that should be remembered between function calls. – Joel Trauger Oct 10 '15 at 21:30
  • Not very familiar. I'm new to c. Please elaborate. – Sara Oct 10 '15 at 21:33
  • I'm think rather than use `s++`, you might need to pass `s+1` to the recursive call to `preorder`. That's usually how you'd handle descending, while allowing you to backup as you return. You'd also have to null terminate at `s+1` before printing. – Daniel Stevens Oct 10 '15 at 22:20
  • @Sara A static variable retains its value between function calls. The next time you call the function the variable will keep the value it was the last time it was assigned a value. – Joel Trauger Oct 10 '15 at 23:24
  • @JoelTrauger thank you, I'll keep that in mind. – Sara Oct 10 '15 at 23:39

1 Answers1

2

You're very close.

There are two problems, both in the for loop which runs through the trie branches:

else{
  printf("%c",'a'+i);
  hold[s] = 'a'+i;
  s++;

First problem is that you are printing (almost) everything twice. In the above snippet, you print the prefix as you trace the tree. Then later when you reach the end of the word, you print the entire word:

  if(follow->child[i]->end == 1){
    printf("\n");
    hold[s] = '\0';
    printf("%s", hold);
  }

So there was no need to print the prefix at all, and the double printing is confusing.

Second, the s argument represents the depth in the tree, which is the length of the current prefix. So it should be constant during the exploration of a trie node. But everytime you find a new branch, you increment it (s++ in the first snippet above). Instead of doing that, you need the recursive call to use s + 1 as its argument, so that it will be called with the correct prefix-length.

You can also simplify your control structures quite a bit.

Here's an example:

void preorder(struct node *follow, char hold[200], int s){
  int i = 0;
  if(follow == NULL){
    return;
  }
  /* Print the word at the beginning instead of the end */
  if (follow->end) {
    hold[s] = 0;
    printf("%s\n", hold);
  }

  for(i = 0; i < 26; i++){
    /* preorder returns immediately if its argument is NULL, so
     * there's no need to check twice. Perhaps even better would be
     * to do the check here, and not do it at the beginning.
     */
    hold[s] = 'a'+i;
    preorder(follow->child[i], hold, s + 1);
  }
}
rici
  • 234,347
  • 28
  • 237
  • 341
  • I came up with very nearly the exact same solution. I had `hold[s] = 0` though, rather than `s+1`. Are you sure about the index? – Daniel Stevens Oct 10 '15 at 22:48
  • @DanielStevens: Nope, you're right. Moving it up one level in the call tree means it should be `s`. Thanks. – rici Oct 10 '15 at 22:52
  • Thank you so much. I felt like my code was getting too over complicated and a day of working through it made me blind to simplifications. Your explanation makes perfect sense. – Sara Oct 10 '15 at 23:11