I am building a trie in Java. When I look up a keyword in the trie, its entry also needs to tell me which paragraphs of the text the keyword appears in. Does anyone have some insight into how I would go about storing the paragraph numbers in the trie along with the word? Do I index the whole text first and then put it into the trie? I'm a little stumped!
1 Answer
Usually a trie is a tree built from a single node type: each node holds a list of child nodes of the same type, each child holds its own list, and so on. Every word stored in the trie corresponds to exactly one node (the node you reach by following the word's characters from the root), and every node corresponds to exactly one string. So if you add an extra field to the node type, you can store additional information there, such as a paragraph number.
To construct this, loop through every word in the text and add it to the trie by walking down from the root and creating any missing nodes, then mark the node corresponding to the word with the paragraph number (only the last node, not every node on the way to it).
Note that since a word may appear in several paragraphs, you probably want a list of paragraph numbers in each node rather than a single number. This way, nodes for strings that do not occur as words in the text (for example, prefixes of longer words) can simply hold an empty list.
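To make the steps above concrete, here is a minimal sketch of how this could look in Java. All names (`ParagraphTrie`, `Node`, `insert`, `paragraphsOf`, `indexText`) are made up for illustration, and it assumes paragraphs are separated by blank lines and that words are runs of letters, compared case-insensitively. Adjust those rules to whatever your text actually looks like.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: a trie whose word nodes carry a list of paragraph numbers.
class ParagraphTrie {

    // One node per prefix; children are keyed by the next character.
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        // Paragraphs in which the word ending at this node occurs.
        // Stays empty for nodes that are only prefixes of longer words.
        final List<Integer> paragraphs = new ArrayList<>();
    }

    private final Node root = new Node();

    // Walk down the trie, creating missing nodes, then tag only the last node.
    void insert(String word, int paragraph) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        // Record each paragraph at most once per word.
        if (!node.paragraphs.contains(paragraph)) {
            node.paragraphs.add(paragraph);
        }
    }

    // Returns the paragraph numbers for the word, or an empty list if it was never inserted.
    List<Integer> paragraphsOf(String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return Collections.emptyList();
            }
        }
        return Collections.unmodifiableList(node.paragraphs);
    }

    // Assumption: blank lines separate paragraphs, non-letters separate words.
    static ParagraphTrie indexText(String text) {
        ParagraphTrie trie = new ParagraphTrie();
        String[] paragraphs = text.split("\\n\\s*\\n");
        for (int i = 0; i < paragraphs.length; i++) {
            String lower = paragraphs[i].toLowerCase(Locale.ROOT);
            for (String word : lower.split("[^\\p{L}]+")) {
                if (!word.isEmpty()) {
                    trie.insert(word, i + 1); // paragraph numbers start at 1
                }
            }
        }
        return trie;
    }
}
```

With this sketch you do not need a separate indexing pass: `indexText` reads the text once and inserts each word together with the number of the paragraph it was found in. For the text `"The cat sat.\n\nA dog ran.\n\nThe cat ran."`, `paragraphsOf("cat")` gives `[1, 3]`, while `paragraphsOf("ca")` gives an empty list because that node exists only as a prefix.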
