0

I am a hobbyist programmer trying find the appropriate place to put unique_ptr in my binary tree. Originally, I used unique_ptr for the left and right children, but that "meant" that each node "owned" each subsequent node. I have been told in a previous post that the tree should own its nodes. This is my solution to the problem: all of the trees nodes are stored in a vector of unique pointers and unique_ptr::get is used to extract the raw pointers which are used in the usual manner (example add).

#include "pch.h"
#include <iostream>
#include <sstream>
#include <string>
#include <memory>
#include <vector>
#include <unordered_map>

class Node
{
public:
    Node(int data = 0) : data_(data), left_child(nullptr), 
        right_child(nullptr) {}

    int data_;
    Node *left_child;
    Node *right_child;
};

class Tree
{
public:
    Tree(int data = 0) {
        store.emplace_back(std::make_unique<Node>(data));
        root = store.at(0).get();
    }
    void add(int data) {
        Node *index = root;
        while (true) {
            if (data == index->data_) {
                std::stringstream ss;
                ss << data << " already exists\n";
                throw std::invalid_argument(ss.str());
            }
            if (data < index->data_) {
                if (index->left_child != nullptr) {
                    index = index->left_child;
                    continue;
                }
                std::unique_ptr<Node> temp = std::make_unique<Node>(data);
                index->left_child = temp.get();
                store.push_back(std::move(temp));
                return;
            }
            if (index->right_child != nullptr) {
                index = index->right_child;
                continue;
            }
            std::unique_ptr<Node> temp = std::make_unique<Node>(data);
            index->right_child = temp.get();
            store.push_back(std::move(temp));
            return;
        }
    }
private:
    std::vector<std::unique_ptr<Node>> store;
    Node* root;
};

Removing a node seems like it will be painfully slow. Find the value in the tree (fast), find the pointer in the std::vector (slow), remove the entry from the vector, and finally trim the pointer from the parent. Am I on the right track? If not, hints would be welcome.

davidbear
  • 375
  • 2
  • 13
  • 2
    I came to the conclusion that using smart pointers to implement basic container types is likely as much trouble as it is worth. The container itself is a RAII object and it is probably simpler to write a cleanup constructor than figure out if your nodes contain pointers to nodes or pointers to smart pointers to nodes... or whatever. – Galik Mar 19 '19 at 05:32
  • When people tell you that the tree should own the nodes, they usually mean what Galik suggested. While the `vector`'s a good way to make sure the nodes are all freed when the tree is gone, you're no better off in the general case of freeing a node without destroying the tree. If you miss a case, or have a case where you forget to erase, you are almost back to where you started with detritus building up in the `vector`. You might as well manually `delete`. – user4581301 Mar 19 '19 at 05:40

2 Answers2

1

Using std::unique_ptr for the children is the quick-and-dirty solution, but it does not match the requirement of the question. Putting the pointers into a vector is a bad idea due to the convoluted code, and time complexity involved.

The quick-and-dirty solution is to write a function in the tree, that will recursively delete the nodes. The disadvantage is a potential stack-overflow if the tree is unbalanced (just like with std::unique_ptr on the children). There are several ways to combat the potential stack-overflow:

The efficient solution, that does not have the stack-overflow, neither the potential of std::bad_alloc exception. It is a DFS algorithm, using the freed tree nodes as the stack. The Node::left entry will point to the payload (the subtree to be freed), and Node::right will have the role of next in the DFS stack (a linked list).

static Node * & nextNode(Node & node)
{ return node.right_child; }
static Node * & payload(Node & node)
{ return node.left_child; } 

Tree::~Tree()
{
    Node temp;
    Node *stack = & temp;
    payload(*stack) = root;
    nextNode(*stack) = nullptr;
    constexpr bool TRACE = false;

    while (stack) {
        Node * treeNode = payload(*stack);
        Node * toPush1 = treeNode->left_child;
        Node * toPush2 = treeNode->right_child;
        if (toPush1) {
            payload(*stack) = toPush1;
            if (toPush2) {
                payload(*treeNode) = toPush2;
                nextNode(*treeNode) = stack;
                stack = treeNode;
            } else {
                if (TRACE) std::cout << treeNode->data_ << " ";
                delete treeNode;
            }
        }
        else if (toPush2) {
            payload(*stack) = toPush2;
            if (TRACE) std::cout << treeNode->data_ << " ";
            delete treeNode;
        }
        else { // nothing to push 
            Node *nextStack = nextNode(*stack);
            if (TRACE) std::cout << treeNode->data_ << " ";
            delete treeNode;
            if (stack != &temp) {
                if (TRACE) std::cout << stack->data_ << " ";
                delete stack;
            }
            stack = nextStack;
        }
    }
    // free the stack.
    while (stack) {
        Node *nextStack = nextNode(*stack);
        if (stack != &temp) {
            if (TRACE) std::cout << stack->data_ << " ";
            delete stack;
        }
        stack = nextStack;
    }
    if (TRACE) std::cout << '\n';
}

This will get you both memory efficient, with O(1) additional memory, and time efficient, with O(N) time complexity.

For completeness, here is the rest of the Tree class:

class Tree
{
public:
    Tree(int data = 0) {
        root = new Node(data);
    }
    ~Tree();
    Copy ctor, assignment, move ctor, move assignment

    void add(int data) {
        Node *index = root;
        while (true) {
            if (data == index->data_) {
                std::stringstream ss;
                ss << data << " already exists\n";
                throw std::invalid_argument(ss.str());
            }
            if (data < index->data_) {
                if (index->left_child != nullptr) {
                    index = index->left_child;
                    continue;
                }
                std::unique_ptr<Node> temp = std::make_unique<Node>(data);
                index->left_child = temp.release();

                return;
            }
            if (index->right_child != nullptr) {
                index = index->right_child;
                continue;
            }
            std::unique_ptr<Node> temp = std::make_unique<Node>(data);
            index->right_child = temp.release();

            return;
        }
    }
private:
    // owning the root and all descendants recursively
    Node* root;
};
Michael Veksler
  • 8,217
  • 1
  • 20
  • 33
  • Sorry to be so thick, but what does the rest of `Tree` look like? How is ownership established? How are child nodes added to the root? – davidbear Mar 21 '19 at 23:07
  • @davidbear added more of Tree to the answer. Ownership is implied in the Tree. Ownership goes recursively from root. – Michael Veksler Mar 22 '19 at 00:59
  • So bottom line is that `std::unique_ptr` just cannot be effectively used in a binary tree structure where the nodes are "owned" by the tree? Given all the talk that there is no "good reason" to ever use `new`, this surprises me. – davidbear Mar 22 '19 at 02:55
  • @davidbear not quite. The recommended approach is to have unique_ptr for children, but your request was to avoid that. After all, a subtree owns the left and right subtrees. If you insist you can rename Node to Subtree. Also, the stack overflow is non-issue for self balancing trees since even for a billion nodes it is not even 100 levels deep. But if one insists on unbalanced trees, then freening is better done iteratively, on top of unique_ptr – Michael Veksler Mar 22 '19 at 07:20
  • the only reason for my request to avoid `std::unique_ptr` for children pointers was because I was told that that was a poor ownership model by someone on stackoverflow. My code was to solve the problem of wanting to use `std::unique_ptr` and get the ownership model "correct." I appreciate this discourse. Thanks – davidbear Mar 22 '19 at 15:30
0

Of course it's a bad idea to have an vector of all the allocated nodes in your Tree class. As you pointed-out, manipulating it to find and erase a Node is slow (i.e. depends linearly on your tree size), and diminishes all the advantages your tree is supposed to have.

A first suggestion would be using std::unique_ptr<Node> for your left_child and right_child members of your Node. Then you'll not need the vector to store all the nodes.

But in your specific implementation is not enough: you do nothing to ballance the tree, hence in worst-case its depth will grow linearly, hence the recursive cleanup will fail (stack overflow). You'll need a hybrid iterative-recursive cleanup.

But you can do this as a next step. First step - just get rid of the vector.

Michael Veksler
  • 8,217
  • 1
  • 20
  • 33
valdo
  • 12,632
  • 2
  • 37
  • 67
  • As I said, my first crack at this problem used `unique_ptr` for both left and right children, but I was told that `unique_ptr` conveyed "ownership." And it was suggested that the tree should do the owning. This specific implementation is just some example code using this ownership model. An AVL-type tree and proper destructors would probably be used in an actual implementation. Thank you for your thoughts. – davidbear Mar 19 '19 at 05:41
  • 2
    Ok, then don't use `unique_ptr`. Instead you'd need to implement the recursive deletion of your `Node` in the `Tree` class. But in either case using `vector` to store all the nodes just to "own" them is the worst idea. – valdo Mar 19 '19 at 05:48