What is the fastest way to match string prefixes for the building of a Radix search tree?

Question

Full disclosure this is for an assignment in an algorithms class, but I'm not looking for someone to write any code for me or hold my hand. Anyways we have to do the Boggle game. For those unaware, basically you are given a Boggle board and have to find as many words as possible, which can take huge amounts of time given the string possibilities with recursion.

To bring down OP time, we were encouraged to utilize whatever algorithm we'd like to try and "weed out" strings as we go along. Currently I have a system that will find all possible 3 letter suffixes possible by storing them in a dictionary, and if the string has 3 letters but doesn't match any valid suffixes, recursion is stopped.

Getting to the crux of the question, I'm trying to implement a Radix suffix tree, and currently it looks something like this

public static class BoggleRadix
{
    private Node root;

    public BoggleRadix()
    {

    }

    public class Node
    {
        public Node()
        {
           List<Edge> nodeEdges = new List<Edge>() 
           {

           }
        }

    }

    public class Edge
    {
        public String edgeString;
        public Node edgeNode;

        public Edge()
        {

        }
        public Edge(String s, Edge e)
        {
            this.edgeString = s;
            this.edgeNode = e;
        }

    }

}

So I'm storing the pointers in an unordered list, which means if I'm checking for suffixes, I could expect, worse case, 26 operations to even find the suffix pointer I need to continue down the tree.

Is there any way to speed the insertions up, stream line the process? Is it worth building a small dictionary in memory for each node that would check for the first letter of the suffix's existence? Is this so trivial it wouldn't even matter?

Could you simply make an Edge[26] static array? or if you're worried about space, simply keep the array sorted (O(ln n) isn't too bad (4 lookups for a full table)? If you'll only be using alphanum chars, then a dict may be larger than the static array, depending on implementation. — Jim Keener, Sep 09 '14 at 12:55

score 1 · Answer 1 · answered Sep 09 '14 at 16:10

1

I agree with jimktrains's suggestion of using array instead of a linked list for the suffixes. You can convert a capital ASCII letter c to an array index with (int)c - (int)'A'. Tests on my system dictionary suggest that there are only a few thousand three-letter prefixes, so space usage should not be a concern.

On the other hand, you could store the whole dictionary as a deterministic acyclic finite state automaton. Then it might be wise to try to save space. Instead of binary search, which branches unpredictably, I would suggest a bitmap. In each node, have a field int bitmap; equal to the bitwise OR (the | operator, with identity element 0) of (1 << i) for each occupied index i in the uncompressed array. Then, to test the presence of an uncompressed index i, check whether bitmap & (1 << i) is nonzero. If so, then the index into the compressed array is Integer.bitCount(bitmap & ((1 << i) - 1)).

answered Sep 09 '14 at 16:10

David Eisenstat

64,237
7
60
120

As bitmap a BitSet would do nicely. – Joop Eggen Sep 09 '14 at 16:26
@JoopEggen That's a good class to know about, but here, it introduces the per-node space overhead of an object reference, an object, array metadata, and a length field, as well as the time overhead of two additional indirections (node -> BitSet -> backing array, unless JVMs have gotten a lot cleverer while I wasn't looking) and probably more than a few arithmetic instructions to compute indexes into the backing array and loop over them. – David Eisenstat Sep 09 '14 at 16:39
oh, I like your assembler near approach. BitSet is comfortable and has a succinct API. Personally I still would use BitSet for a fast algorithm. Still a late +1 is deserved. – Joop Eggen Sep 09 '14 at 20:07

What is the fastest way to match string prefixes for the building of a Radix search tree?

1 Answers1