
I am studying dynamic programming algorithms for building an optimal binary search tree in C++. I have written my own program, but I do not know whether it produces the correct answer. I searched for sample code online, but I only found examples that handle successful searches, so I have nothing to compare against. I also suspect there is a mistake in my code, but I cannot pinpoint it.

If the problem is unfamiliar, you can read about it here: Optimal Binary Search Tree

Brief description: the task is to build an optimal binary search tree. The input is two arrays giving the probabilities of successful (found) and unsuccessful (unfound) searches. From that data, I need to compute the minimum expected cost of searching for an arbitrary item in the tree.

Below is my source code:

double OptimalBinarySearchTree(double Found[], double Unfound[], int n)
{
    double Cost[n + 2][n + 1], Freq[n + 2][n + 1];
    int i, j, k, l;
    double temp = 0;
    memset(Cost, 0, sizeof(Cost));
    memset(Freq, 0, sizeof(Freq));
    for (i = 1; i <= n; i++)
    {
        Cost[i][i - 1] = Unfound[i - 1];
        Freq[i][i - 1] = Unfound[i - 1];
    }
    for (l = 1; l <= n; l++)
    {
        for (i = 1; i <= n - l + 1; i++)
        {
            j = l + i - 1;
            Freq[i][j] = Freq[i][j - 1] + Found[j] + Unfound[j];
            Cost[i][j] = INT32_MAX;
            for (k = i; k <= j; k++)
            {
                temp = 0;
                if (k > i)
                    temp += Cost[i][k - 1];
                if (k < j)
                    temp += Cost[k + 1][j];
                temp += Freq[i][j];
                if (temp < Cost[i][j])
                    Cost[i][j] = temp;
            }
        }
    }
    return Cost[1][n];
}

For example, when I run my program with

    double Found[7] = {0, 0.15, 0.10, 0.05, 0.10, 0.20};
    double Unfound[7] = {0.05, 0.10, 0.05, 0.05, 0.05, 0.10};

My program returns 2.45, but I believe the correct answer may be 2.85. I do not know where my algorithm goes wrong. I would really appreciate it if someone could check my program and point out the mistake.

Hoang Nam
  • It's not very clear what that function is supposed to do. – molbdnilo Jul 17 '20 at 13:23
  • Can you also explain why you want to use dynamic programming to search for a value in a BST? It seems quite exotic. – m.raynal Jul 17 '20 at 13:40
  • I am sorry but maybe you all have misunderstood my question. This is a problem that builds an optimal Binary search Tree. The problem is given two sets to record the probability of found and unfound objects in a binary search tree. From that given data, I need to calculate the minimum cost of searching an arbitrary object in the binary search tree. To understand more, you can search this problem on the internet. – Hoang Nam Jul 17 '20 at 13:58
  • @molbdnilo I think you should read the problem first. It is DP so if you do not understand its function, it is impossible for you to find the mistake here. If there is something confusing such as variables or parameters, please ask me and I will answer instantly. – Hoang Nam Jul 17 '20 at 14:01
  • @HoangNam The only problem description you provided was “optimizing binary search tree”. It’s not at all obvious how any of this relates to that problem. Please add the description to the question, not in the comments and not as a link. – molbdnilo Jul 17 '20 at 14:07
  • @molbdnilo Did you actually read the information in the link I gave? That link has a full description of the problem and even the algorithm to solve it. – Hoang Nam Jul 17 '20 at 14:15
  • Your algorithm is pretty close to the one here: https://stackoverflow.com/questions/16987670/dynamic-programming-why-knuths-improvement-to-optimal-binary-search-tree-on2, maybe you could check if they lead to the same result – PhM75 Jul 17 '20 at 17:29
  • @PhM75 Thanks for your comment. I observed that I have used the same algorithms here but my program leads to the wrong result. It's quite confusing here. Can you help me to check it? – Hoang Nam Jul 18 '20 at 01:16

2 Answers


From what I can see, the two algorithms differ in how they compute the cost of a candidate sub-root: E_{i,j} = E_{i,r-1} + E_{r+1,j} + W_{i,j}. Your code does not add the left-subtree value when k = i and does not add the right-subtree value when k = j:

        temp = 0;
        if (k > i)
            temp += Cost[i][k - 1];
        if (k < j)
            temp += Cost[k + 1][j];
        temp += Freq[i][j];
        if (temp < Cost[i][j])
            Cost[i][j] = temp;

Is there a reason for treating these two cases specially? If not, which appears to be the case in the other implementation of the DP algorithm and in the link you provided, the recurrence should be:

        temp = Cost[i][k - 1] + Cost[k + 1][j] + Freq[i][j];
        if (temp < Cost[i][j])
            Cost[i][j] = temp;
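
As a hedged restatement of why the guards are unnecessary (same E and W notation as above, with Found and Unfound taken from the question's code), the recurrence with its base case can be written as:

$$
E_{i,j} =
\begin{cases}
\text{Unfound}[i-1] & \text{if } j = i - 1,\\
\min_{i \le r \le j} \left( E_{i,r-1} + E_{r+1,j} + W_{i,j} \right) & \text{if } i \le j,
\end{cases}
\qquad
W_{i,j} = W_{i,j-1} + \text{Found}[j] + \text{Unfound}[j].
$$

An empty subtree therefore still has a non-zero cost. Using the question's sample data, the single-key subproblem gives

$$
E_{1,1} = E_{1,0} + E_{2,1} + W_{1,1} = 0.05 + 0.10 + 0.30 = 0.45,
$$

whereas the guarded version skips both subtree terms for k = i = j and yields only W_{1,1} = 0.30.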
PhM75
  • Oh, thank you very much. The reason for this mistake is that I was handling the cases with no left or right subtree separately. But when I re-read my source code, I found that I had changed **i** to run from 1 instead of 0, so the checks are unnecessary. – Hoang Nam Jul 18 '20 at 09:42

According to the algorithm

Algorithm OBST(p, q, n)
// e[1…n+1, 0…n] : Optimal subtree
// w[1…n+1, 0…n] : Sum of probability
// root[1…n, 1…n] : Used to construct OBST

for i ← 1 to n + 1 do
    e[i, i - 1] ← q_{i-1}
    w[i, i - 1] ← q_{i-1}
end

for m ← 1 to n do
    for i ← 1 to n - m + 1 do
        j ← i + m - 1
        e[i, j] ← ∞
        w[i, j] ← w[i, j - 1] + p_j + q_j
        for r ← i to j do
            t ← e[i, r - 1] + e[r + 1, j] + w[i, j]
            if t < e[i, j] then
                e[i, j] ← t
                root[i, j] ← r
            end
        end
    end
end
return (e, root)
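
As a small worked check of the first loop's upper bound (a sketch using the sample data from the question, with p = Found and q = Unfound), consider the last single-key subproblem for n = 5:

$$
e[5,5] = e[5,4] + e[6,5] + w[5,5] = q_4 + q_5 + (q_4 + p_5 + q_5) = 0.05 + 0.10 + 0.35 = 0.50 .
$$

The term e[6,5] = q_5 is only set when the base-case loop reaches i = n + 1. If the loop stops at n and the table was zero-filled, e[6,5] stays 0 and this entry comes out as 0.40 instead.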

The base cases of Cost (e) and Freq (w) must be initialized for i from 1 to n + 1, not 1 to n; otherwise Cost[n + 1][n] stays 0 instead of Unfound[n]. And, as @PhM75 said, there is no need for the explicit k > i and k < j checks. So the final code should be:


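    double OptimalBinarySearchTree(double Found[], double Unfound[], int n)
    {
        // Note: variable-length arrays are a compiler extension, not standard C++.
        double Cost[n + 2][n + 1], Freq[n + 2][n + 1];
        int i, j, k, l;
        double temp = 0;
        memset(Cost, 0, sizeof(Cost));
        memset(Freq, 0, sizeof(Freq));
        // Base cases: an empty subtree [i, i - 1] costs Unfound[i - 1].
        // The loop must reach n + 1 so that Cost[n + 1][n] is set.
        for (i = 1; i <= n + 1; i++)
        {
            Cost[i][i - 1] = Unfound[i - 1];
            Freq[i][i - 1] = Unfound[i - 1];
        }
        // l is the number of keys in the subproblem [i, j].
        for (l = 1; l <= n; l++)
        {
            for (i = 1; i <= n - l + 1; i++)
            {
                j = l + i - 1;
                Freq[i][j] = Freq[i][j - 1] + Found[j] + Unfound[j];
                Cost[i][j] = INT_MAX;
                // Try every key k in [i, j] as the root; no special cases needed.
                for (k = i; k <= j; k++)
                {
                    temp = Cost[i][k - 1] + Cost[k + 1][j] + Freq[i][j];
                    if (temp < Cost[i][j])
                        Cost[i][j] = temp;
                }
            }
        }
        return Cost[1][n];
    }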

NOTE: The use of memset is not necessary, but it is retained to preserve code similarity with the question.

With this fix, the result for the sample input from the question is 2.75, not 2.85.