
I am writing an octree-based container that must hold anywhere from 10 points to 1 billion points in memory. Because of the amount of data being loaded, I need to watch memory consumption closely.

Everything seems to work and the data is segmented as needed, but insertion is very slow, probably because of the redistribution of data from parent to children whenever a node splits. Is there anything I can do to optimize this? Have I implemented this properly? I could reserve the vector in each node to hold the maximum number of points, but that would increase the required memory significantly.

Using a simple R-tree type container, I load 468 million points in about 48 seconds. Using the octree below, loading takes 245 seconds.

    class OctreeNode {
    public:
        std::vector<std::shared_ptr<OctreeNode>>    Children;
        std::vector<TPoint> Data;
        BoundingBox         Bounds;

        OctreeNode(){}

        OctreeNode(BoundingBox bounds) : Bounds(bounds){
        }

        ~OctreeNode(void){}

        void Split();

    };

    typedef std::shared_ptr<OctreeNode> OctreeNodePtr;


    void OctreeNode::Split()
    {
        Point box[8];
        Bounds.Get8Corners(box);
        Point center = Bounds.Center;

        Children.reserve(8);
        Children.push_back(OctreeNodePtr(new OctreeNode(BoundingBox::From(box[0], center))));
        Children.push_back(OctreeNodePtr(new OctreeNode(BoundingBox::From(box[1], center))));
        Children.push_back(OctreeNodePtr(new OctreeNode(BoundingBox::From(box[3], center))));
        Children.push_back(OctreeNodePtr(new OctreeNode(BoundingBox::From(box[2], center))));


        Children.push_back(OctreeNodePtr(new OctreeNode(BoundingBox::From(box[5], center))));
        Children.push_back(OctreeNodePtr(new OctreeNode(BoundingBox::From(box[4], center))));
        Children.push_back(OctreeNodePtr(new OctreeNode(BoundingBox::From(box[6], center))));
        Children.push_back(OctreeNodePtr(new OctreeNode(BoundingBox::From(box[7], center))));
    }



    Octree::Octree(BoundingBox bounds) : Bounds(bounds)
    {
        _root = OctreeNodePtr(new OctreeNode(bounds));
        _root->Split();
    }


    Octree::~Octree()
    {
    }



    bool Octree::InsertPoint(TPoint &p)
    {
        return InsertPoint(p, _root);
    }

    bool Octree::InsertPoint(TPoint &p, const OctreeNodePtr &parent)
    {
        if (parent->Children.size() != 0){
            for (size_t i = 0; i < parent->Children.size(); i++){
                OctreeNodePtr &currentNode = parent->Children[i];
                if (currentNode->Bounds.IsContained(p.ToPoint3d())){
                    return InsertPoint(p, currentNode);
                }           
            }

            // Was not able to insert a point.
            return false;
        }

        BoundingBox newBounds = parent->Bounds;
        newBounds.Extend(p.ToPoint3d());


        // Check for split condition...
        if (parent->Data.size() == MaxPerNode && newBounds.XLength() > 0.01){

            // Split it...thus generating children nodes
            parent->Split();


            // Resize the children arrays so that we don't have to keep allocating when redistributing points..
            for (size_t i = 0; i < parent->Children.size(); i++){
                parent->Children[i]->Data.reserve(parent->Data.size());
            }


            // Distribute the points that were in the parent to its children..
            for (size_t i = 0; i < parent->Data.size(); i++){
                TPoint originalPoint = parent->Data[i];
                if (!InsertPoint(originalPoint, parent)){
                    printf("Failed to insert point\n");
                    break;
                }
            }

            // Insert the current point.
            if (!InsertPoint(p, parent)){
                printf("Failed to insert point\n");
            }


            // Resize the arrays back so it fits the size of the data.....
            for (size_t i = 0; i < parent->Children.size(); i++){
                parent->Children[i]->Data.shrink_to_fit();
            }

            // clear out the parent information
            parent->Data.clear();
            parent->Data.shrink_to_fit();
            return true;
        } else {
            // Node is valid so insert the data..
            if (parent->Data.size() <= 100000){
                parent->Data.push_back(p);
            } else {
                printf("Too much data in tiny node... Stop adding\n");
            }

            return true;
        }


    }


    void Octree::Compress(){
        Compress(_root);
    }

    void Octree::Compress(const OctreeNodePtr &parent){


        if (parent->Children.size() > 0){

            // Look for and remove useless cells who do not contain children or point cloud data.
            size_t j = 0;
            bool removed = false;
            while (j < parent->Children.size()){
                if (parent->Children[j]->Children.size() == 0 && parent->Children[j]->Data.size() == 0){
                    parent->Children.erase(parent->Children.begin() + j);
                    removed = true;
                } else {
                    Compress(parent->Children[j]);
                    ++j;
                }
            }

            if (removed)
                parent->Children.shrink_to_fit();

            return;
        }

        parent->Data.shrink_to_fit();
    }
user1000247
  • I'm not sure what your problem really is, but I can offer a few pointers. I'm not sure you need to be calling `shrink_to_fit` so much. In fact, try not calling it altogether. Additionally, with a `vector` that holds pointer types, it is faster to call `resize()` and then use assignment at indices rather than `reserve` coupled with `push_back` when you know you'll be inserting a few elements. If you're using C++11 or higher, you can use `emplace_back` when you're constructing the element in place as you insert it into the container, which should avoid a copy/move. – AndyG Mar 18 '16 at 14:25
  • @AndyG Thanks for the tips. I do not think the `reserve` + `push_back` on the nodes is the slowdown. Without the `shrink_to_fit` on the children, each child node would allocate room for up to 25000 points even if only one was added. I am not sure I understand what `emplace_back` does, but when I changed the `Data.push_back` to `emplace_back`, load time increased by one second. I have removed the `shrink_to_fit` on the children; this decreased load time by 10 seconds, at the cost of requiring more memory until the `Compress` method is called after creation. – user1000247 Mar 18 '16 at 15:22
  • Hmmm, `emplace_back` shouldn't cause an increase in load time. Can you show how you used it? Regardless, the difference between an `emplace_back` and a `push_back` should be pretty minimal unless you're getting into the millions. – AndyG Mar 18 '16 at 15:24
  • All I did was change `if (parent->Data.size() <= 100000){ parent->Data.push_back(p);` to `if (parent->Data.size() <= 100000){ parent->Data.emplace_back(p);` – user1000247 Mar 18 '16 at 15:39
  • Sorry, in that scenario, `emplace_back` won't give you any benefit. You could get away with using `resize` and assignment in your `Split` function, however. – AndyG Mar 18 '16 at 17:14
  • Your code is incomplete; in particular, it seems to be missing a `main()` function and at least one `#include`. Please edit your code so it's a minimal, complete, verifiable example of your problem, then we can try to reproduce and solve it. You should also read the "How to Ask" guide. – Toby Speight Jan 18 '18 at 10:53

2 Answers


Just a small thing, but replacing this:

    Children.push_back(OctreeNodePtr(new OctreeNode(BoundingBox::From(box[0], center))));

with this:

    Children.push_back(std::make_shared<OctreeNode>(BoundingBox::From(box[0], center)));

will reduce loading times a little and reduce memory consumption.

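This is true for any `shared_ptr`: `std::make_shared` allocates the control block together with the shared object in a single allocation, whereas constructing a `shared_ptr` from a raw `new` allocates the two separately.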
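As a rough illustration of applying this to all eight children at once, here is a minimal sketch. `Box` and `Node` are hypothetical stand-ins for the question's `BoundingBox` and `OctreeNode`, and the child ordering (bit 0 for x, bit 1 for y, bit 2 for z) is an assumption of this sketch, not the ordering produced by `Get8Corners`.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for the question's BoundingBox.
struct Box {
    std::array<double, 3> lo;  // minimum corner (x, y, z)
    std::array<double, 3> hi;  // maximum corner (x, y, z)
};

// Hypothetical stand-in for the question's OctreeNode (point data omitted).
struct Node {
    std::vector<std::shared_ptr<Node>> children;
    Box bounds;

    explicit Node(const Box& b) : bounds(b) {}

    // Creates the eight children. Octant bit 0 selects the upper half in x,
    // bit 1 in y, bit 2 in z (an ordering chosen for this sketch).
    void Split() {
        assert(children.empty());
        children.reserve(8);
        for (std::size_t octant = 0; octant < 8; ++octant) {
            Box child = bounds;
            for (std::size_t axis = 0; axis < 3; ++axis) {
                const double mid = 0.5 * (bounds.lo[axis] + bounds.hi[axis]);
                if (octant & (std::size_t{1} << axis)) {
                    child.lo[axis] = mid;  // upper half on this axis
                } else {
                    child.hi[axis] = mid;  // lower half on this axis
                }
            }
            // make_shared places the Node and its control block in one allocation.
            children.push_back(std::make_shared<Node>(child));
        }
    }
};
```

With a billion points the number of nodes can be large, so saving one allocation per node may add up, though the effect on total load time is likely to be modest.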

Richard Hodges

What I see here is that, to insert a point, you iterate over the 8 children and test each one to see whether it contains the point.

The main advantage of an octree is that, given the bounding box center and your point's position, you can compute the child index directly without iterating over the children. The child region selected this way is called an octant.

You can find a simple implementation of this here: https://github.com/brandonpelfrey/SimpleOctree/blob/master/Octree.h#L79 Look for the function `getOctantContainingPoint(...)`.

I used this code as a base for my own implementation, and the index calculation can be sped up considerably (using SIMD instructions...).
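The idea can be sketched roughly as follows. `Vec3`, `OctantIndex` and `SelectChild` are hypothetical names, and the bit layout (bit 0 for x, bit 1 for y, bit 2 for z) is an assumption of this sketch; it only works if the children are created in the same order, which is not necessarily the order the question's `Split()` uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical point type; the question uses TPoint / Point.
struct Vec3 {
    double x, y, z;
};

// Computes the child slot with three comparisons instead of up to eight
// bounding-box containment tests.
inline std::size_t OctantIndex(const Vec3& p, const Vec3& center) {
    std::size_t index = 0;
    if (p.x >= center.x) index |= 1;  // upper half in x
    if (p.y >= center.y) index |= 2;  // upper half in y
    if (p.z >= center.z) index |= 4;  // upper half in z
    return index;
}

// Picks the child directly. The children must be stored in octant order.
template <typename Child>
Child& SelectChild(std::vector<Child>& children, const Vec3& p, const Vec3& center) {
    assert(children.size() == 8 && "children must be stored in octant order");
    return children[OctantIndex(p, center)];
}
```

Points lying exactly on a splitting plane go to the upper octant here, so every point inside the parent maps to exactly one child and the "was not able to insert" fallthrough cannot occur for such points.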

You are also concerned about memory consumption. To reduce it, it can be both faster and lighter to recompute each node's bounding box during the descent rather than storing one in every node.
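A minimal sketch of that recomputation, assuming a cell is described by its center and per-axis half extents (`Cell` and `ChildCell` are hypothetical names, and the octant bit layout of bit 0 for x, bit 1 for y, bit 2 for z is an assumption of this sketch):

```cpp
#include <cassert>
#include <cstddef>

struct Vec3 {
    double x, y, z;
};

// Hypothetical cell description: a center plus half extents along each axis.
struct Cell {
    Vec3 center;
    Vec3 half;
};

// Derives a child's cell from its parent's cell and the octant index, so
// only the root cell has to be stored; every other cell is rebuilt on the
// way down the tree.
inline Cell ChildCell(const Cell& parent, std::size_t octant) {
    assert(octant < 8);
    const Vec3 q{parent.half.x * 0.5, parent.half.y * 0.5, parent.half.z * 0.5};
    Cell child;
    child.center.x = parent.center.x + ((octant & 1) ? q.x : -q.x);
    child.center.y = parent.center.y + ((octant & 2) ? q.y : -q.y);
    child.center.z = parent.center.z + ((octant & 4) ? q.z : -q.z);
    child.half = q;
    return child;
}
```

During insertion the current cell would be carried along as a local variable, starting from the root's cell and replaced by `ChildCell(cell, octant)` at each level, so the nodes themselves would only need to hold their children and their points.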

Seb Maire