I'm trying to create a balanced ternary tree. To do that, I want to insert keys in an order that ensures each node partitions the range of characters below it evenly. So ideally the first node will be 'M', with 'F' and 'T' on either side, and so on down, each key being the midpoint of its partition.
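To make the intended ordering concrete, here is a minimal sketch that derives a midpoint-first ordering from a sorted alphabet by splitting ranges breadth-first. The names `MidpointOrder` and `Build` are hypothetical and not part of the code in this question. For the letters A to Z it starts with "MFT"; the order within each deeper level can differ from a hand-written lookup string, which should not affect the balance.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: builds a midpoint-first ordering of a sorted alphabet.
static class MidpointOrder
{
    public static string Build(string sorted)
    {
        var result = new StringBuilder(sorted.Length);
        // Each queue entry is an inclusive index range still to be split.
        var ranges = new Queue<(int Lo, int Hi)>();
        ranges.Enqueue((0, sorted.Length - 1));
        while (ranges.Count > 0)
        {
            var (lo, hi) = ranges.Dequeue();
            if (lo > hi) continue;            // empty range, nothing to emit
            int mid = lo + (hi - lo) / 2;     // lower midpoint of the range
            result.Append(sorted[mid]);
            ranges.Enqueue((lo, mid - 1));    // left half is processed later
            ranges.Enqueue((mid + 1, hi));    // right half is processed later
        }
        return result.ToString();
    }
}
```

Calling `MidpointOrder.Build("ABCDEFGHIJKLMNOPQRSTUVWXYZ")` would give a string that could stand in for the letter part of a lookup constant.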

I have the code below, which accomplishes this. I could obviously precompute the index for all possible characters and save that, but I'm wondering if someone has a better arithmetic or bit-twiddling approach to calculating the ordering than the lookup array. It doesn't need to balance exactly on 'M'; any fair, binary recursive split that sorts midpoints first would do.

Edit: Yes, it would be better if I could partition the keys evenly rather than the alphabet, but I add to this tree at run-time.

    /// <summary>
    /// A midpoint first comparison of two strings, 
    /// earliest sort = closest to middle, then middle of each side
    /// and so on down. Designed to create a balanced TernaryTree.
    /// </summary>
    private class MiddleOutComparer : IComparer<string>
    {
        public static readonly MiddleOutComparer Instance = new MiddleOutComparer();
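        // Characters in midpoint-first order; a character's position here is its sort rank.
        const string MidPointFirst = "MFTIPCWKRHUGEYBXNOJLQSDVAZ5271368490";
        private int Compare(string a, string b, int index)
        {
            // A string that runs out first sorts before the longer one.
            if (index == a.Length && index == b.Length) return 0;
            if (index >= a.Length) return -1;
            if (index >= b.Length) return +1;
            // IndexOf returns -1 for characters missing from MidPointFirst,
            // so they sort first and compare equal to each other.
            int aValue = MidPointFirst.IndexOf(char.ToUpperInvariant(a[index]));
            int bValue = MidPointFirst.IndexOf(char.ToUpperInvariant(b[index]));
            if (aValue == bValue) return Compare(a, b, index + 1);
            return aValue.CompareTo(bValue);
        }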
        public int Compare(string x, string y)
        {
            return Compare(x, y, 0);
        }
    }
Ian Mercer
  • MidPointFirst string does not have the character 'G', are you aware? – Selçuk Cihan Jan 12 '16 at 08:16
  • How does a `fair, binary recursive split` `ensure […] each node partitions the range […] evenly` in a ternary tree? That said, have a look at the least significant bit set in a range starting at 1 (e.g., least significant 5 bits of latin characters in Unicode/ASCII) (and reverse the relevant bit range, if that needed spelling out). – greybeard Jan 12 '16 at 09:20
  • @SelçukCihan thanks, added one. Any missing characters get -1 so it doesn't fail, but it would be better if it had 'G' and accented characters too. – Ian Mercer Jan 12 '16 at 16:03
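One possible reading of greybeard's comment, sketched for letters only: take the low five bits of the character (1 to 26 for A to Z in ASCII) and reverse them. The lowest set bit then becomes the highest, so values nearer the middle of the 1 to 31 range get smaller ranks. The names `MidpointRank`, `Rank` and `ReverseFiveBits` are hypothetical. This assumes the split is over the full range 1 to 31, so the root would be 'P' rather than 'M', and digits are not handled because their low five bits collide with those of letters.

```csharp
using System;

// Hypothetical helper: arithmetic rank for A-Z, smaller rank = closer to a midpoint.
static class MidpointRank
{
    // Reverses the low five bits of v (v is expected to be in 0..31).
    static int ReverseFiveBits(int v)
    {
        int r = 0;
        for (int i = 0; i < 5; i++)
        {
            r = (r << 1) | (v & 1);  // move the lowest bit of v onto the end of r
            v >>= 1;
        }
        return r;
    }

    public static int Rank(char c)
    {
        char u = char.ToUpperInvariant(c);
        // Assumption: anything outside A-Z gets -1, mirroring what IndexOf returns.
        if (u < 'A' || u > 'Z') return -1;
        return ReverseFiveBits(u & 31);  // 'A' -> 1 ... 'Z' -> 26 before reversal
    }
}
```

In the comparer, `MidpointRank.Rank(a[index])` could replace the `MidPointFirst.IndexOf(...)` lookup for letters, while digits would still need a table or a separate range.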