I'm trying to write a function that removes redundant items in a binary search tree. I am completely stuck. Can anyone help me? Any help would be appreciated. (C program)

ruka1
  • Please take the [tour] and read [ask] – klutt Mar 09 '21 at 13:01
  • Your problem is obviously either in finding duplicates or removing nodes. Start finding out which of these it is. – klutt Mar 09 '21 at 13:02
  • @klutt I'm having difficulty iterating through the tree and finding duplicates. I'm not sure how to iterate through the tree and be able to modify it at the same time. – ruka1 Mar 09 '21 at 13:04
  • You should post your code and an example with duplicates – Jack Lilhammers Mar 09 '21 at 13:07
  • Surely the solution is not to insert a duplicate item? – Weather Vane Mar 09 '21 at 13:12
  • Write a routine that deletes a node of the tree. Write an in-order traversal of the tree. While traversing the tree, remember the previous node. Whenever the previous node and the current node have the same key, call the routine to delete the previous node. – Eric Postpischil Mar 09 '21 at 13:12
  • I think it's easier said than done. You could have to rebalance the tree, which could mess the traversal. Also, if you have `O(n)` duplicates in your tree, it could even be better to just rebuild a new tree. – Jack Lilhammers Mar 09 '21 at 13:15

1 Answer

As @Jack Lilhammers pointed out in the comments, one option is to create a new, empty tree and insert the nodes of the original tree into it, skipping keys that are already present. You can do it like this:

  1. Create an empty binary search tree (the new tree).
  2. Take the key at the root of the original BST and insert it into the new tree if it is not already there.
  3. Delete the root node of the original BST.
  4. Repeat steps 2-3 until there are no nodes left in the original tree.

Let's implement the procedures needed for a complete working program. First, include the necessary headers:

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

The node structure is:

typedef struct node{
 int key;
 struct node *left, *right, *parent;
}BST;

First we need the create_empty_bst procedure. An empty tree is simply a NULL root pointer, so the function returns BST*:

BST* create_empty_bst(void){
  BST* root = NULL;
  return root;
}

To insert the extracted key, we need an insert function. Its return type can be either void or BST*; the choice is ours. With a void return type, the function must take a pointer to the pointer to the root (a double pointer) so that it can change the caller's root. Otherwise, we pass the root pointer and return the (possibly new) root. Either way works. I will use the first one for insert.

BST* new_Node(int data){
  BST* node = malloc(sizeof *node);
  assert(node != NULL); // alternatively: if (node == NULL) { fprintf(stderr, "out of memory\n"); exit(EXIT_FAILURE); }
  node->key = data;
  node->left = node->right = node->parent = NULL; // a new node always starts as a leaf, so its pointers are NULL
  return node;
}

void insert(BST** root, int x) {
    BST* p = NULL, * y = *root;
    while (y != NULL) {
        p = y;
        if (y->key == x)
            return;               // duplicate: nothing to insert
        if (y->key > x)
            y = y->left;
        else
            y = y->right;
    }
    BST* node = new_Node(x);      // allocate only after the duplicate check, so nothing leaks
    node->parent = p;
    if (p == NULL)
        *root = node;             // the tree was empty
    else if (p->key > node->key)
        p->left = node;
    else
        p->right = node;
}

new_Node(x) creates a node with the appropriate fields. The only addition to the usual insert function is:

if(y->key == x)
   return;

When y->key == x, we have found a duplicate value, so we stop and return without inserting anything. This check prevents duplicate values from being inserted into the new tree.
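For comparison, here is a minimal sketch of the other calling convention mentioned earlier, where the function takes the root by value and returns it, with the same duplicate check applied. The names insert_unique and count_nodes are hypothetical and exist only for this illustration; they are not part of the program in this answer.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int key;
    struct node *left, *right, *parent;
} BST;

/* Returns the (possibly new) root; keys already present are skipped. */
BST *insert_unique(BST *root, int x) {
    BST *p = NULL, *y = root;
    while (y != NULL) {
        if (y->key == x)
            return root;          /* duplicate: the tree is left unchanged */
        p = y;
        y = (x < y->key) ? y->left : y->right;
    }
    BST *node = malloc(sizeof *node);
    assert(node != NULL);
    node->key = x;
    node->left = node->right = NULL;
    node->parent = p;
    if (p == NULL)
        return node;              /* the tree was empty: new node is the root */
    if (x < p->key)
        p->left = node;
    else
        p->right = node;
    return root;
}

/* Counts the nodes in the tree; used here only to observe the effect. */
int count_nodes(const BST *root) {
    return root ? 1 + count_nodes(root->left) + count_nodes(root->right) : 0;
}
```

With this convention the caller writes root = insert_unique(root, x); and forgetting the assignment on the first insertion would leave root as NULL, which is the main trade-off against the double-pointer version.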

The most complex part of the program is the DELETE procedure. It relies on 4 other procedures: SEARCH(BST* root, int x), SUCCESSOR(BST* node), find_min(BST* node), and TRANSPLANT(BST* root, BST* u, BST* v). TRANSPLANT replaces the subtree rooted at u with the subtree rooted at v and returns the (possibly new) root. If you have difficulty understanding DELETE, see Binary Search Trees (Chapter 12) in CLRS, which is where this DELETE comes from. The 5 procedures are implemented as follows:

BST* TRANSPLANT(BST* root, BST* u, BST* v) {
    if (u->parent == NULL)
        root = v;
    else if (u->parent->left == u)
        u->parent->left = v;
    else
        u->parent->right = v;
    if (v != NULL)
        v->parent = u->parent;
    return root;
}

BST* find_min(BST* node) {
    if (node == NULL)
        return node;
    while (node->left)
        node = node->left;
    return node;
}

find_min(node) returns the node with the smallest key in the subtree rooted at node, or NULL if that subtree is empty.

BST* SUCCESSOR(BST* node) {
    if (node == NULL)
        return node;
    if (node->right != NULL)
        return find_min(node->right);

    while (node->parent != NULL && node->parent->right == node)
        node = node->parent;
    return node->parent;
}

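SUCCESSOR(node) returns the node that follows node in an in-order traversal of the whole tree (for distinct keys, the node with the smallest key greater than node->key), or NULL if there is none. It is not limited to the subtree rooted at node: when node has no right child, the search climbs through the parent pointers.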
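As an illustration of how find_min and SUCCESSOR work together, the sketch below walks a tree in ascending key order without recursion. The helpers insert_any and inorder_keys are hypothetical names introduced only for this example; insert_any is assumed to place equal keys in the right subtree.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int key;
    struct node *left, *right, *parent;
} BST;

BST *find_min(BST *node) {
    if (node == NULL)
        return NULL;
    while (node->left)
        node = node->left;
    return node;
}

BST *SUCCESSOR(BST *node) {
    if (node == NULL)
        return NULL;
    if (node->right != NULL)
        return find_min(node->right);
    /* No right subtree: climb until we leave a left subtree. */
    while (node->parent != NULL && node->parent->right == node)
        node = node->parent;
    return node->parent;
}

/* Hypothetical helper: plain insert that keeps duplicates (equal keys go right). */
BST *insert_any(BST *root, int x) {
    BST *node = malloc(sizeof *node);
    assert(node != NULL);
    node->key = x;
    node->left = node->right = node->parent = NULL;
    BST *p = NULL, *y = root;
    while (y != NULL) {
        p = y;
        y = (x < y->key) ? y->left : y->right;
    }
    node->parent = p;
    if (p == NULL)
        return node;
    if (x < p->key)
        p->left = node;
    else
        p->right = node;
    return root;
}

/* Copies up to cap keys into out in ascending order; returns how many were copied. */
int inorder_keys(BST *root, int *out, int cap) {
    int n = 0;
    for (BST *cur = find_min(root); cur != NULL && n < cap; cur = SUCCESSOR(cur))
        out[n++] = cur->key;
    return n;
}
```

In the resulting sequence, duplicate keys always appear next to each other, which is why an in-order walk is a natural way to detect them.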

BST* SEARCH(BST* N, int x) {
    if (N == NULL || N->key == x)
        return N;
    else if (N->key > x)
        return SEARCH(N->left, x);
    else
        return SEARCH(N->right, x);
}

BST* DELETE(BST* root, int x) {
    BST* node = SEARCH(root, x);
    assert(node != NULL);
    if (node->right == NULL) {
        // no right child: the left child (possibly NULL) takes node's place
        root = TRANSPLANT(root, node, node->left);
    }
    else if (node->left == NULL) {
        // only a right child: it takes node's place
        root = TRANSPLANT(root, node, node->right);
    }
    else {
        // two children: the successor takes node's place
        BST* temp = SUCCESSOR(node);
        if (node->right != temp) {
            root = TRANSPLANT(root, temp, temp->right);
            temp->right = node->right;
            temp->right->parent = temp;
        }
        root = TRANSPLANT(root, node, temp);
        temp->left = node->left;
        temp->left->parent = temp;
    }
    free(node); // node is now detached from the tree; release it to avoid a leak
    return root;
}

We now have everything needed to implement the DELETE_DUPLICATES(BST* new_tree, BST* original) procedure.

BST* DELETE_DUPLICATES(BST* new_tree, BST* original) {
    while (original) {
        insert(&new_tree, original->key);
        original = DELETE(original, original->key);
    }
    return new_tree;
}

We are done. To test it, here is the whole program in one piece, so you can see all the procedures together.

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>

typedef struct node {
    int key;
    struct node* left, * right, * parent;
}BST;

void preorder(BST*);
BST* DELETE_DUPLICATES(BST*, BST*);
void display_tree(BST*);
BST* create_empty_bst(void);
BST* new_Node(int);
void insert(BST**, int);
BST* insertSimple(BST*, int);
BST* TRANSPLANT(BST*, BST*, BST*);
BST* SUCCESSOR(BST*);
BST* find_min(BST* );
BST* SEARCH(BST*, int);
BST* DELETE(BST*, int);

BST* create_empty_bst(void) {
    BST* root = NULL;
    return root;
}

BST* new_Node(int data) {
    BST* node = malloc(sizeof *node);
    assert(node != NULL); // alternatively: if (node == NULL) { fprintf(stderr, "out of memory\n"); exit(EXIT_FAILURE); }
    node->key = data;
    node->left = node->right = node->parent = NULL; // a new node always starts as a leaf, so its pointers are NULL
    return node;
}

void insert(BST** root, int x) {
    BST* p = NULL, * y = *root;
    while (y != NULL) {
        p = y;
        if (y->key == x)
            return;               // duplicate: nothing to insert
        if (y->key > x)
            y = y->left;
        else
            y = y->right;
    }
    BST* node = new_Node(x);      // allocate only after the duplicate check, so nothing leaks
    node->parent = p;
    if (p == NULL)
        *root = node;             // the tree was empty
    else if (p->key > node->key)
        p->left = node;
    else
        p->right = node;
}


BST* insertSimple(BST* root, int x) {
    BST* node = new_Node(x);
    BST* p = NULL, * y = root;
    while (y != NULL) {
        p = y;
        if (y->key > node->key)
            y = y->left;
        else
            y = y->right;
    }
    node->parent = p;
    if (p == NULL)
        root = node;
    else if (p->key > node->key)
        p->left = node;
    else
        p->right = node;
    return root;
}

BST* TRANSPLANT(BST* root, BST* u, BST* v) {
    if (u->parent == NULL)
        root = v;
    else if (u->parent->left == u)
        u->parent->left = v;
    else
        u->parent->right = v;
    if (v != NULL)
        v->parent = u->parent;
    return root;
}

BST* find_min(BST* node) {
    if (node == NULL)
        return node;
    while (node->left)
        node = node->left;
    return node;
}

BST* SUCCESSOR(BST* node) {
    if (node == NULL)
        return node;
    if (node->right != NULL)
        return find_min(node->right);

    while (node->parent != NULL && node->parent->right == node)
        node = node->parent;
    return node->parent;
}

BST* SEARCH(BST* N, int x) {
    if (N == NULL || N->key == x)
        return N;
    else if (N->key > x)
        return SEARCH(N->left, x);
    else
        return SEARCH(N->right, x);
}

BST* DELETE(BST* root, int x) {
    BST* node = SEARCH(root, x);
    assert(node != NULL);
    if (node->right == NULL) {
        // no right child: the left child (possibly NULL) takes node's place
        root = TRANSPLANT(root, node, node->left);
    }
    else if (node->left == NULL) {
        // only a right child: it takes node's place
        root = TRANSPLANT(root, node, node->right);
    }
    else {
        // two children: the successor takes node's place
        BST* temp = SUCCESSOR(node);
        if (node->right != temp) {
            root = TRANSPLANT(root, temp, temp->right);
            temp->right = node->right;
            temp->right->parent = temp;
        }
        root = TRANSPLANT(root, node, temp);
        temp->left = node->left;
        temp->left->parent = temp;
    }
    free(node); // node is now detached from the tree; release it to avoid a leak
    return root;
}

void preorder(BST* root) {
    if (root) {
        printf("%d  ", root->key);
        preorder(root->left);
        preorder(root->right);
    }
}

void display_tree(BST* root) {
    preorder(root);
    printf("\n");
}


BST* DELETE_DUPLICATES(BST* new_tree, BST* original) {
    while (original) {
        insert(&new_tree, original->key);
        original = DELETE(original, original->key);
    }
    return new_tree;
}


int main(void) {
    BST* new_tree = NULL, * original = NULL;
    original = insertSimple(original, 20);
    original = insertSimple(original, 20);
    original = insertSimple(original, 12);
    original = insertSimple(original, 30);
    original = insertSimple(original, 18);
    original = insertSimple(original, 12);
    original = insertSimple(original, 30);
    original = insertSimple(original, 11);
    display_tree(original);
    display_tree(new_tree);

    new_tree = DELETE_DUPLICATES(new_tree, original);
    display_tree(new_tree);
    return 0;
}

NOTES: The insertSimple function inserts elements into the original tree. We can't use the insert procedure there because it skips duplicate elements, so insertSimple is written for that purpose. The preorder procedure prints the elements in preorder (look up preorder traversal if you are unfamiliar with it). That's it, I wish you success in your work.
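To verify the result, one option is an in-order traversal that compares each key with the previously visited one, since equal keys are adjacent in sorted order. The sketch below is only an illustration under that assumption; has_duplicates and walk_has_duplicates are hypothetical names, not part of the program above.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct node {
    int key;
    struct node *left, *right, *parent;
} BST;

/* In-order walk; *prev is the last node visited (NULL before the first one). */
bool walk_has_duplicates(const BST *n, const BST **prev) {
    if (n == NULL)
        return false;
    if (walk_has_duplicates(n->left, prev))
        return true;
    if (*prev != NULL && (*prev)->key == n->key)
        return true;              /* same key as the in-order predecessor */
    *prev = n;
    return walk_has_duplicates(n->right, prev);
}

/* Returns true if any key occurs more than once in the tree. */
bool has_duplicates(const BST *root) {
    const BST *prev = NULL;
    return walk_has_duplicates(root, &prev);
}
```

Calling has_duplicates(original) before DELETE_DUPLICATES and has_duplicates(new_tree) after it gives a quick check that the rebuilt tree contains each key only once.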

Alparslan