1

Introduction:

I have successfully implemented Dual Contouring of Hermite Data by Ju et. al. Dual Contouring is an approach to contour volumetric data obtained by adaptively sampling a signed distance field using an octree.

The algorithm has two steps. The first part calculates intersection points and normals with the edges of the octree cells. It then proceeds to calculate a vertex for each leaf cell, which minimizes a quadratic error function.

Part two of the algorithm, builds a triangle mesh by connecting minimizing vertices with quads and triangles for leaf cells which share a minimal (max depth) edge. The authors use three recursive functions to reach these minimal edges.

procCell is called on each octree cell, and spawns: 8 calls to procCell for each of the node's children, 12 calls to procFace for each of the internal faces shared by any two adjacent child nodes, and 6 calls to procEdge for each internal edge shared by 4 child nodes.

procFace can spawn up to 4 calls to procFace when the nodes have children, and 4 calls to procEdge for each of the 4 internal edges of a face shared by 4 child nodes.

procEdge can spawn 2 calls to procEdge when the nodes have children along the same edge.

The recursion terminates when all nodes are leaf nodes, at which point it proceeds to connect minimizing vertices in the 4 adjoining cells with quads or triangles. These recursive functions have the effect of traversing the dual graph of the octree.

My rough implementation of the recursive dual octree traversal, stripped of unrelated code, can be found at the bottom of the post. Please note that the code is not optimized and needs some heavy refactoring.

The Ju et. al. approach has some problems that need to be addressed. Most of the Dual Contouring or Dual Marching Cubes algorithm variants, in the academic literature either produce non-manifold meshes, or create self-intersecting triangles.

For others who are interested in this topic, I have opted to use the method described in Isosurfaces Over Simplicial Partitions of Multiresolution Grids by Manson and Schaefer. This algorithm produces manifold meshes without self-intersection. It also has the benefit of efficiently representing sharp features (caused by discontinuities in the signed distance field) as well as thin features.

Question

The second technique uses the same recursive functions procCell, procFace, and procEdge as the original Dual Contouring paper. My final goal is to move most of the algorithm, out-of-core, onto the GPU and eventually onto a FPGA. I am using OpenCL to accomplish this.

In order to do this, I need to convert the algorithm into an interative algorithm. Even on the CPU, I am concerned about the recursion causing call stack overflow. My code needs to handle extremely large and deep volumetric data sets. The fact that the recursion is head-ended is especially a big problem.

How would one implement the same complex tree traversal using iteration?

Code for Ju. et. al.

DualRecursive.hpp

#pragma once

#include <functional>

#include <Math/Vec.hpp>

namespace arc {

    class DualTree;
    class IDualTreeNode;
    class DualTreeInternalNode;
    class DualTreeHeteroNode;
    class DualTreeHomoNode;
    class LocationCode;

    // const bool useHardNormalsFlag = true;

    // Vertex corner indices for each of the 12 cube eges.
    // Vertices are orientated -x to +x, -y to +y, and -z to +z.
    const int edgeVertexMap[12][2] = {{0, 1},
                                      {1, 2},
                                      {3, 2},
                                      {0, 3},
                                      {4, 5},
                                      {5, 6},
                                      {7, 6},
                                      {4, 7},
                                      {0, 4},
                                      {1, 5},
                                      {2, 6},
                                      {3, 7}};

    // Used to call `edgeProc` from `cellProc`.
    // 0-3 are the four child nodes which share one of the 6 internal edges of
    // parent. 4 is the direction.
    const int cellProcEdgeMask[6][5] = {{7, 4, 0, 3, 0},
                                        {6, 5, 1, 2, 1},
                                        {5, 1, 0, 4, 2},
                                        {6, 2, 3, 7, 3},
                                        {2, 3, 0, 1, 4},
                                        {6, 7, 4, 5, 5}};

    // Used to call `faceProc` from inside `cellProc`.
    // 0 is frist node index
    // 1 is second node index
    // 2 is direction
    const int cellProcFaceMask[12][3] = {{0, 1, 1},
                                         {1, 2, 3},
                                         {2, 3, 0},
                                         {3, 0, 2},
                                         {4, 5, 1},
                                         {5, 6, 3},
                                         {6, 7, 0},
                                         {7, 4, 2},
                                         {0, 4, 5},
                                         {1, 5, 5},
                                         {2, 6, 5},
                                         {3, 7, 5}};

    // Used to call `faceProc` from inside `faceProc`.
    // 0-7 are node index pairs for adjacent faces
    // 8 is direction
    const int faceProcFaceMask[6][9] = {{7, 6, 4, 5, 0, 1, 3, 2, 0},
                                        {6, 7, 5, 4, 1, 0, 2, 3, 1},
                                        {5, 6, 1, 2, 0, 3, 4, 7, 2},
                                        {6, 5, 2, 1, 3, 0, 7, 4, 3},
                                        {2, 6, 3, 7, 0, 4, 1, 5, 4},
                                        {6, 2, 7, 3, 4, 0, 5, 1, 5}};

    // Used to call `edgeProc` from inside `faceProc`.
    // 0-3 are node indices
    // 4-7 are side orders
    // 8 is direction
    const int faceProcEdgeMask[6][4][9] = {
        // Direction 0
        {{4, 0, 1, 5, 0, 0, 1, 1, 2},
         {3, 2, 1, 0, 0, 1, 1, 0, 4},
         {7, 3, 2, 6, 0, 0, 1, 1, 3},
         {7, 6, 5, 4, 0, 1, 1, 0, 5}},

        // Direction 1
        {{4, 0, 1, 5, 1, 1, 0, 0, 2},
         {3, 2, 1, 0, 1, 0, 0, 1, 4},
         {7, 3, 2, 6, 1, 1, 0, 0, 3},
         {7, 6, 5, 4, 1, 0, 0, 1, 5}},

        // Direction 2
        {{1, 0, 3, 2, 0, 0, 1, 1, 4},
         {4, 7, 3, 0, 0, 1, 1, 0, 0},
         {5, 4, 7, 6, 0, 0, 1, 1, 5},
         {5, 6, 2, 1, 0, 1, 1, 0, 1}},

        // Direction 3
        {{1, 0, 3, 2, 1, 1, 0, 0, 4},
         {4, 7, 3, 0, 1, 0, 0, 1, 0},
         {5, 4, 7, 6, 1, 1, 0, 0, 5},
         {5, 6, 2, 1, 1, 0, 0, 1, 1}},
        // Direction 4
        {{3, 0, 4, 7, 0, 0, 1, 1, 0},
         {1, 5, 4, 0, 0, 1, 1, 0, 2},
         {2, 1, 5, 6, 0, 0, 1, 1, 1},
         {2, 6, 7, 3, 0, 1, 1, 0, 3}},

        // Direction 5
        {{3, 0, 4, 7, 1, 1, 0, 0, 0},
         {1, 5, 4, 0, 1, 0, 0, 1, 2},
         {2, 1, 5, 6, 1, 1, 0, 0, 1},
         {2, 6, 7, 3, 1, 0, 0, 1, 3}}};

    // Used to call `edgeProc` from inside `edgeProc`.
    // 0-1 are child nodes along current edge of first node,
    // 2-3 are child nodes along current edge of second node,
    // etc.
    // 8 is the direction.
    const int edgeProcEdgeMask[6][9] = {
        {0, 1, 3, 2, 7, 6, 4, 5, 0},
        {0, 1, 3, 2, 7, 6, 4, 5, 1},
        {0, 3, 4, 7, 5, 6, 1, 2, 2},
        {0, 3, 4, 7, 5, 6, 1, 2, 3},
        {0, 4, 1, 5, 2, 6, 3, 7, 4},
        {0, 4, 1, 5, 2, 6, 3, 7, 5},
    };

    void buildTreeMesh( DualTree* dualTree,
                        std::function<float( Vec3f const& )>& distFunction,
                        bool drawEdges = false );

    void cellProc( IDualTreeNode* node, LocationCode code );
    void faceProc( int mask[2],
                   int dir,
                   INode* nodes[2],
                   LocationCode codes[2] );
    void edgeProc( int mask[4],
                   int dir,
                   INode* nodes[4],
                   LocationCode codes[4] );

}  // namespace arc

DualRecursive.cpp

#include <array>
#include <cassert>
#include <functional>
#include <vector>

#include "DualRecursive.hpp"

namespace arc {

    static std::function<float( Vec3f const& )> distF;

    // -- buildTreeMesh function --
    void buildTreeMesh( DualTree* dualTree,
                        std::function<float( Vec3f const& )>& distFunction,
                        bool drawEdges ) {
        distF = distFunction;

        dualTreeGlobal = dualTree;

        LocationCode a;

        cellProc( dualTree->rootNode, a );
    }

    // -- cellProc function --
    void cellProc( IDualTreeNode* node, LocationCode code ) {
        assert( node );
        if( !node ) {
            return;
        }

        auto t = node->getType();

        if( t != NodeType::Internal ) {
            return;
        }

        auto c = dynamic_cast<InternalNode*>( node )->getChildren();

        // Process 8 cells for children of this internal node.
        code.down();
        for( size_t i = 0; i < 8; i++ ) {
            cellProc( c[i], code );
            if( i < 7 ) {
                code.right();
            }
        }
        code.up();

        // Process 6 common edges shared by this internal node's children.
        code.down();
        for( int i = 0; i < 6; i++ ) {
            int m[4];
            m[0] = cellProcEdgeMask[i][0];
            m[1] = cellProcEdgeMask[i][1];
            m[2] = cellProcEdgeMask[i][2];
            m[3] = cellProcEdgeMask[i][3];

            INode* n[4];
            n[0] = c[m[0]];
            n[1] = c[m[1]];
            n[2] = c[m[2]];
            n[3] = c[m[3]];

            LocationCode codes[4] = {code, code, code, code};
            codes[0].setBack( m[0] );
            codes[1].setBack( m[1] );
            codes[2].setBack( m[2] );
            codes[3].setBack( m[3] );

            edgeProc( m, i, n, codes );
        }
        code.up();

        // Process 12 pairs of touching faces in this node's children.
        code.down();
        for( int i = 0; i < 12; i++ ) {
            int m[2];
            m[0] = cellProcFaceMask[i][0];
            m[1] = cellProcFaceMask[i][1];
            auto dir = cellProcFaceMask[i][2];

            INode* n[2];
            n[0] = c[m[0]];
            n[1] = c[m[1]];

            LocationCode codes[2] = {code, code};
            codes[0].setBack( m[0] );
            codes[1].setBack( m[1] );

            faceProc( m, dir, n, codes );
        }
        code.up();
    }

    // -- faceProc function --
    void faceProc( int mask[2],
                   int dir,
                   INode* nodes[2],
                   LocationCode codes[2] ) {
        NodeType t[2] = {nodes[0]->getType(), nodes[1]->getType()};

        if( t[0] != NodeType::Internal && t[1] != NodeType::Internal ) {
        } else {
            if( t[0] == NodeType::Homo || t[1] == NodeType::Homo ) {
                return;
            }
            // Process 4 pairs of adjacent faces which make up this face.
            for( size_t i = 0; i < 4; i++ ) {
                int m[2];
                INode* n[2];
                LocationCode c[2];
                m[0] = faceProcFaceMask[dir][i * 2];
                m[1] = faceProcFaceMask[dir][i * 2 + 1];
                auto dirFromMask = faceProcFaceMask[dir][8];
                assert( dirFromMask == dir );

                if( t[0] == NodeType::Internal ) {
                    auto internal0 = (InternalNode*)nodes[0];
                    n[0] = internal0->getChildAt( m[0] );
                    c[0] = codes[0];
                    c[0].down();
                    c[0].setBack( m[0] );
                } else {
                    n[0] = nodes[0];
                    c[0] = codes[0];
                }

                if( t[1] == NodeType::Internal ) {
                    auto internal1 = (InternalNode*)nodes[1];
                    n[1] = internal1->getChildAt( m[1] );
                    c[1] = codes[1];
                    c[1].down();
                    c[1].setBack( m[1] );
                } else {
                    n[1] = nodes[1];
                    c[1] = codes[1];
                }

                faceProc( m, dir, n, c );
            }

            // Process edges internal to current face pair.
            for( size_t i = 0; i < 4; i++ ) {
                int m[4];
                INode* n[4];
                LocationCode c[4];

                for( size_t j = 0; j < 4; j++ ) {
                    m[j] = faceProcEdgeMask[dir][i][j];

                    auto order = faceProcEdgeMask[dir][i][j + 4];
                    assert( order == 0 || order == 1 );

                    c[j] = codes[order];

                    assert( t[order] != NodeType::Homo );
                    if( t[order] == NodeType::Hetero ) {
                        n[j] = nodes[order];
                    } else {
                        c[j].down();
                        c[j].setBack( m[j] );

                        n[j] = dynamic_cast<InternalNode*>( nodes[order] )
                                   ->getChildAt( m[j] );
                    }
                }

                auto d = faceProcEdgeMask[dir][i][8];
                edgeProc( m, d, n, c );
            }
        }
    }

    // -- edgeProc function --
    void edgeProc( int mask[4],
                   int dir,
                   INode* nodes[4],
                   LocationCode codes[4] ) {
        NodeType t[4] = {nodes[0]->getType(),
                         nodes[1]->getType(),
                         nodes[2]->getType(),
                         nodes[3]->getType()};

        auto cond0 = t[0] == NodeType::Hetero || t[0] == NodeType::Homo;
        auto cond1 = t[1] == NodeType::Hetero || t[1] == NodeType::Homo;
        auto cond2 = t[2] == NodeType::Hetero || t[2] == NodeType::Homo;
        auto cond3 = t[3] == NodeType::Hetero || t[3] == NodeType::Homo;

        if( cond0 && cond1 && cond2 && cond3 ) {
            // if( t[0] == NodeType::Hetero && t[1] == NodeType::Hetero &&
            // t[2] == NodeType::Hetero && t[3] == NodeType::Hetero ) {

            vector<size_t> indices;
            vector<HeteroNode*> n;
            vector<Vec3f> p;
            vector<Vec3f> norms;

            for( size_t i = 0; i < 4; i++ ) {
                if( t[i] == NodeType::Hetero ) {
                    bool unique = true;
                    for( size_t j = 0; j < n.size(); j++ ) {
                        if( nodes[i] == (INode*)n[j] ) {
                            unique = false;
                            break;
                        }
                    }
                    if( unique ) {
                        indices.emplace_back( i );
                        n.emplace_back( (HeteroNode*)nodes[i] );
                        p.emplace_back( n.back()->hermiteData.minimizer );
                        norms.emplace_back( n.back()->hermiteData.normal );
                    }
                }
            }

            if( indices.size() > 2 ) {
                // Find deepest node.
                int maxDepth = -1;
                int iMaxDepth = -1;
                for( size_t i = 0; i < indices.size(); i++ ) {
                    auto depth = codes[indices[i]].getDepth();
                    if( (int)depth > maxDepth ) {
                        maxDepth = depth;
                        iMaxDepth = indices[i];
                    }
                }

                assert( maxDepth > 0 );
                assert( iMaxDepth > -1 );

                assert( nodes[iMaxDepth]->getType() == NodeType::Hetero );

                array<array<array<int, 2>, 3>, 4> fooTable;
                fooTable[0][0] = {0, 1};
                fooTable[0][1] = {0, 3};
                fooTable[0][2] = {0, 4};

                fooTable[1][0] = {2, 3};
                fooTable[1][1] = {4, 7};
                fooTable[1][2] = {1, 5};

                fooTable[2][0] = {6, 7};
                fooTable[2][1] = {5, 6};
                fooTable[2][2] = {2, 6};

                fooTable[3][0] = {4, 5};
                fooTable[3][1] = {1, 2};
                fooTable[3][2] = {3, 7};

                int index0 = 0, index1 = 0;

                index0 = fooTable[iMaxDepth][dir / 2][0];
                index1 = fooTable[iMaxDepth][dir / 2][1];

                auto heteroMaxDepth = (HeteroNode*)nodes[iMaxDepth];
                auto codeMaxDepth = codes[iMaxDepth];

                auto m0 = heteroMaxDepth->m[index0];
                auto m1 = heteroMaxDepth->m[index1];

                auto c0 = codeMaxDepth.getCorners()[index0];
                auto c1 = codeMaxDepth.getCorners()[index1];

                auto flipAxis =
                    m1 != Material::Vacuum ? ( c0 - c1 ) : ( c1 - c0 );

                auto signChange = m0 != m1;

                auto mSolid = m0 != Material::Vacuum ? m0 : m1;

                if( signChange ) {
                    if( n.size() == 4 ) {
                        if( useHardNormalsFlag ) {
                            bool badNormal0 =
                                p[0] == p[1] || p[1] == p[2] || p[3] == p[0];

                            bool badNormal1 =
                                p[0] == p[2] || p[2] == p[3] || p[3] == p[0];

                            bool badNormal2 = lengthSquared( p[0] ) <= 0.0 ||
                                              lengthSquared( p[1] ) <= 0.0 ||
                                              lengthSquared( p[2] ) <= 0.0 ||
                                              lengthSquared( p[3] ) <= 0.0;

                            // badNormal0 = false;
                            // badNormal1 = false;
                            // if( !badNormal0 && !badNormal1 && !badNormal2 ) {
                            if( true ) {
                                assert( p[0] != p[1] );
                                assert( p[1] != p[2] );
                                assert( p[2] != p[0] );

                                assert( p[0] != p[2] );
                                assert( p[2] != p[3] );
                                assert( p[3] != p[0] );

                                auto cross0 = cross( p[1] - p[0], p[2] - p[1] );
                                assert( lengthSquared( cross0 ) > 0.0 );
                                auto n0 = normalize(
                                    cross( p[1] - p[0], p[2] - p[1] ) );
                                norms[0] = n0;
                                norms[1] = n0;
                                norms[2] = n0;

                                auto cross1 = cross( p[2] - p[0], p[3] - p[0] );
                                assert( lengthSquared( cross1 ) > 0.0 );
                                auto n1 = normalize( cross1 );
                                norms[0] = n1;
                                norms[2] = n1;
                                norms[3] = n1;

                                bool swap0 = false;
                                if( dot( n0, flipAxis ) < 0.0 ) {
                                    n0 = -n0;
                                    std::swap( p[0], p[2] );
                                    swap0 = true;
                                }

                                array<Vec3f, 6> vNorms;

                                vNorms[0] = csg::calculateGrad( p[0], distF );
                                vNorms[1] = csg::calculateGrad( p[1], distF );
                                vNorms[2] = csg::calculateGrad( p[2], distF );

                                vNorms[0] = n0;
                                vNorms[1] = n0;
                                vNorms[2] = n0;

                                triangleMesh->emplace_back(
                                    p[0], vNorms[0], color0 );
                                triangleMesh->emplace_back(
                                    p[1], vNorms[1], color0 );
                                triangleMesh->emplace_back(
                                    p[2], vNorms[2], color0 );

                                if( swap0 ) {
                                    std::swap( p[0], p[2] );
                                }

                                if( dot( n1, flipAxis ) < 0.0 ) {
                                    n1 = -n1;
                                    std::swap( p[0], p[2] );
                                }

                                vNorms[3] = csg::calculateGrad( p[0], distF );
                                vNorms[4] = csg::calculateGrad( p[2], distF );
                                vNorms[5] = csg::calculateGrad( p[3], distF );

                                vNorms[3] = n1;
                                vNorms[4] = n1;
                                vNorms[5] = n1;

                                triangleMesh->emplace_back(
                                    p[0], vNorms[3], color1 );
                                triangleMesh->emplace_back(
                                    p[2], vNorms[4], color1 );
                                triangleMesh->emplace_back(
                                    p[3], vNorms[5], color1 );
                            }
                        }

                    } else if( n.size() == 3 ) {
                        if( useHardNormalsFlag ) {
                            bool badNormal =
                                p[0] == p[1] || p[1] == p[2] || p[2] == p[0];

                            bool badNormal2 = lengthSquared( p[0] ) <= 0.0 ||
                                              lengthSquared( p[1] ) <= 0.0 ||
                                              lengthSquared( p[2] ) <= 0.0;
                            // badNormal = false;
                            // if( !badNormal && !badNormal2 ) {
                            if( true ) {
                                assert( p[0] != p[1] );
                                assert( p[1] != p[2] );
                                assert( p[2] != p[0] );

                                auto cross0 = cross( p[1] - p[0], p[2] - p[1] );
                                assert( lengthSquared( cross0 ) > 0.0 );
                                auto n0 = normalize( cross0 );
                                norms[0] = n0;
                                norms[1] = norms[0];
                                norms[2] = norms[0];

                                if( dot( n0, flipAxis ) < 0.0 ) {
                                    n0 = -n0;
                                    std::swap( p[0], p[2] );
                                }
                                array<Vec3f, 3> vNorms;
                                vNorms[0] = csg::calculateGrad( p[0], distF );
                                vNorms[1] = csg::calculateGrad( p[1], distF );
                                vNorms[2] = csg::calculateGrad( p[2], distF );

                                vNorms[0] = n0;
                                vNorms[1] = n0;
                                vNorms[2] = n0;

                                triangleMesh->emplace_back(
                                    p[0], vNorms[0], color0 );
                                triangleMesh->emplace_back(
                                    p[1], vNorms[1], color0 );
                                triangleMesh->emplace_back(
                                    p[2], vNorms[2], color0 );

                                // getDebugRenderer()->drawLine(
                                // p[0], p[0] + 0.05 * vNorms[0], Magenta );
                                // getDebugRenderer()->drawLine(
                                // p[1], p[1] + 0.05 * vNorms[1], Magenta );
                                // getDebugRenderer()->drawLine(
                                // p[2], p[2] + 0.05 * vNorms[2], Magenta );
                            }
                        }
                    }
                }
            }

        } else if( t[0] == NodeType::Internal || t[1] == NodeType::Internal ||
                   t[2] == NodeType::Internal || t[3] == NodeType::Internal ) {
            auto test0 = codes[0] == LocationCode( {2, 0, 5} );
            auto test1 = codes[1] == LocationCode( {2, 0, 4} );
            auto test2 = codes[2] == LocationCode( {1, 3, 7} );
            auto test3 = codes[3] == LocationCode( {1, 3, 6} );

            auto test = test0 && test1 && test2 && test3;

            // assert( !test );

            size_t possibleHeteros = 0;
            for( size_t i = 0; i < 4; i++ ) {
                if( t[i] == NodeType::Internal || t[i] == NodeType::Hetero ) {
                    possibleHeteros++;
                }
            }

            if( possibleHeteros > 2 ) {
                for( size_t i = 0; i < 2; i++ ) {
                    int m[4];
                    INode* n[4];
                    LocationCode c[4];

                    assert( dir == edgeProcEdgeMask[dir][8] );
                    m[0] = edgeProcEdgeMask[dir][i];
                    m[1] = edgeProcEdgeMask[dir][i + 2];
                    m[2] = edgeProcEdgeMask[dir][i + 4];
                    m[3] = edgeProcEdgeMask[dir][i + 6];

                    for( size_t j = 0; j < 4; j++ ) {
                        if( t[j] == NodeType::Internal ) {
                            n[j] = dynamic_cast<InternalNode*>( nodes[j] )
                                       ->getChildAt( m[j] );
                            c[j] = codes[j];
                            c[j].down();
                            c[j].setBack( m[j] );
                        } else {
                            n[j] = nodes[j];
                            c[j] = codes[j];
                        }
                    }

                    edgeProc( m, dir, n, c );
                }
            }
        }
    }

}  // namespace arc
Jacques Nel
  • 581
  • 2
  • 11
  • 2
    The code you're asking about _must_ be included in the question, as links can go stale or change. Without it, your question lacks a [mcve]. – 1201ProgramAlarm Feb 26 '18 at 20:37
  • 1
    Dunno if this helps but https://stackoverflow.com/questions/2431310/graph-algorithms-on-gpu Personally I've been struggling to implement the _Ju et al._ paper just on a CPU. – hegel5000 Feb 26 '18 at 20:57
  • Research 'Tail Recursion'. Converting to 'tail recursion' might be easily done. Pros - TR runs as fast as iteration, it does not burn thru the stack. Cons - my code ran in TR mode only when compiled with higher optimization, making debug somewhat more difficult. – 2785528 Feb 26 '18 at 20:59
  • @1201ProgramAlarm The link to a GitHub Gist should be stable. The code is quite long, but I will in-line directly if that improves the question. – Jacques Nel Feb 26 '18 at 21:57
  • @DOUGLASO.MOEN I am familiar with tail recursion being equivalent to iteration. I’m aware that compilers typically convert to a iteration, but I don’t think converting the algorithm to tail recursion is trivial. Regardless, recursion is not an option for converting to a parallel OpenCL implementation. – Jacques Nel Feb 26 '18 at 22:01
  • Thank you @hegel5000, I will check it out. – Jacques Nel Feb 26 '18 at 22:02

0 Answers0