6

I have a 3D grid where each point (x,y,z) is associated with a cost value. The cost of any point (x,y,z) is not known in advance; to learn it, we need to make a complex query which is really expensive. One thing we do know is that the cost is monotonically non-decreasing in all 3 dimensions.

Now, given a cost C, I need to find the points (x,y,z) on the surface which have cost C, while evaluating the cost at as few points as possible. How can I solve this problem?

When I searched online, I found contour-identification techniques such as marching cubes, but all of them assume every point's cost is known in advance. In my case the major metric is that the number of points whose cost is evaluated should be minimal.

It would be helpful if someone could suggest a way to get at least approximate locations, if not exact ones.

Wolfie
CRM
  • It would be helpful if you added the code for the cost function, or just outlined it. However, by "monotonically non-decreasing in all 3 dimensions" do you mean that increasing any of the coordinates increases the cost value? Seems like using a k-d tree would help. Actually, you could probably use the structure of the k-d tree as the search, though it might not be the most efficient way. Though an octree might work better. In theory either should give you about O(log N) queries if the data structures are a good match for your cost functions. You also might want to consider interpolation search. – Nuclearman Dec 03 '14 at 07:22
  • @Nuclearman You can imagine cost function like this. There is a black box which when given (x,y,z), gives you back cost(x,y,z) but calling this black box is a time-consuming operation which we have to reduce. – CRM Dec 03 '14 at 08:43
  • @Nuclearman Yes, increasing any of the coordinates increases the cost value. Regarding the k-d tree: can we use these data structures given that we don't know the cost of any location in advance and we are trying to minimize explored locations? – CRM Dec 03 '14 at 08:44
  • @Nuclearman Thanks for the reply. Don't we need the cost of all locations before constructing a k-d tree or octree? – CRM Dec 03 '14 at 10:22
  • Roughly how many points will there be and what is the performance of the cost function? – Nuclearman Dec 05 '14 at 09:45
  • @Nuclearman The resolution of 3d object would be like: n * n * n where n = 100 or 200 or 300, ... up to 1000. The cost function is very expensive to evaluate for any given (x,y,z). Hence it must be evaluated at only a very small number of locations. – CRM Dec 05 '14 at 10:05
  • How does the cost compare when the x value is higher but the y value is lower, or vice versa? If the cost can be higher or lower in that case, then the recursive approach I had in mind might not work as well. – Nuclearman Dec 05 '14 at 10:50
  • @Nuclearman cost is monotonically non-decreasing in each dimension – CRM Dec 05 '14 at 17:59
  • Do you have a *cubic lattice* (points p_ijk = (x_k, y_j, z_i)), some other regular lattice, or a 3D object composed of vertices, edges and faces? My approach would definitely depend on this. – Nominal Animal Dec 08 '14 at 04:03
  • @NominalAnimal you can think of 3d object as a grid of 100*100*100 points to start with – CRM Dec 08 '14 at 12:51
  • No! An *object* is a [closed surface](http://en.wikipedia.org/wiki/Surface#Closed_surfaces). What you have sounds like a [scalar field](http://en.wikipedia.org/wiki/Scalar_field) sampled in a [regular grid](http://en.wikipedia.org/wiki/Regular_grid), except that taking/calculating each sample is costly. Perhaps you wish to find the [isosurface](http://en.wikipedia.org/wiki/Isosurface) of the scalar field, corresponding to some specific value, while examining the minimal number of samples? Or do you have some other rule or structure that defines the surface you wish to limit the sampling to? – Nominal Animal Dec 08 '14 at 18:38
  • @NominalAnimal scalar field sampled in a regular grid, except that taking/calculating each sample is costly best describes it – CRM Dec 09 '14 at 07:10
  • @NominalAnimal I have edited the question to mention the grid. Yes, I have to find the isosurface corresponding to a specific value while examining a minimal number of samples. The only clue about the grid is that the cost is monotonically non-decreasing. – CRM Dec 09 '14 at 07:12

4 Answers

7

Rewritten explanation: (original text, in case it might clarify the idea to someone, is kept unchanged below the line)

We have some function f(x,y,z) in three dimensions, and we wish to find the surface f(x,y,z) = c. Since the function yields a single number, it defines a scalar field, and the surface we are looking for is the isosurface c.

In our case, evaluating the function f(x,y,z) is very costly, so we wish to minimize the number of times we use it. Unfortunately, most isosurface algorithms assume the opposite.

My suggestion is to use an isosurface walk similar to the one Fractint used for two-dimensional fractals. Code-wise it is complicated, but it should minimize the number of function evaluations needed -- that was exactly the purpose it was implemented for in Fractint.

Background / History:

In the late 1980s and early 1990s, I encountered the fractal drawing suite Fractint. Computers were much slower then, and evaluating each point was painfully slow. A lot of effort was put into making Fractint display fractals as fast as possible, but still accurately. (Some of you might remember the color-cycling it could do by rotating the colors in the palette used. It was hypnotic; here is a YouTube clip from the 1995 documentary "Colors of Infinity", which both color-cycles and zooms in. Calculating a full-screen fractal could take hours at high zoom factors close to the actual fractal set, but then you could save it as an image and use the color-cycling to "animate" it.)

Some of those fractals were, or had regions where, the number of iterations needed was monotonically non-decreasing toward the actual fractal set -- that is, no "islands" sticking out, just a steady, occasional increase in iteration steps. For such regions, one fast evaluation mode used edge tracing to locate the boundaries where the number of iterations changed, in other words the outlines of regions filled with a single color. After closing a region, it then traced towards the center of that region to find the next iteration edge; after that was closed too, it could simply fill the donut- or C-shaped region between those boundaries with the correct constant color, without evaluating the function for those pixels!

Here, we have a very similar situation, except in three dimensions instead of two. Each isosurface is also two-dimensional by definition, so really, all that changes is how we walk the boundary.

The walk itself is similar to flood fill algorithms, except that we walk in three dimensions, and our boundary is the isosurface we're tracing.

We sample the original function in a regular grid, say an N×N×N grid. (This is not the only possibility, but it is the easiest and most useful case, and what the OP is doing.)

In general, the isosurfaces will not pass through the grid points exactly, but between the grid points. Therefore, our task is to find the grid cells the isosurface passes through.

In an N×N×N regular grid, there are (N-1)×(N-1)×(N-1) cubic cells: (figure: one cubic cell in a grid)

Each cell has eight corners at (x,y,z), (x+1,y,z), (x,y+1,z), (x+1,y+1,z), (x,y,z+1), (x+1,y,z+1), (x,y+1,z+1), and (x+1,y+1,z+1), where x,y,z ∈ ℕ, 0 ≤ x,y,z ≤ N-2, are the integer grid coordinates.

Carefully note the integer grid coordinate limits. If you think about it, you'll realize that an N×N×N grid has only (N-1)×(N-1)×(N-1) cells, and since we use the grid coordinates for the corner closest to origin, the valid coordinate range for that corner is 0 to N-2, inclusive.

If f(x,y,z) increases monotonically in each dimension, then isosurface c passes through cell (x,y,z) if

f(x,y,z) ≤ c

and at least one of

f(x+1, y,   z) > c
f(x,   y+1, z) > c
f(x+1, y+1, z) > c
f(x,   y,   z+1) > c
f(x+1, y,   z+1) > c
f(x,   y+1, z+1) > c
f(x+1, y+1, z+1) > c

If f(x,y,z) is monotonically non-decreasing -- that is, its partial derivatives are either zero or positive at all points -- then the above locates two-dimensional isosurfaces, and the outer surface of isovolumes (volumes where f(x,y,z) is constant). The inner surface of an isovolume c consists of those cells (x,y,z) for which

f(x,y,z) < c

and at least one of

f(x+1, y,   z) ≥ c
f(x,   y+1, z) ≥ c
f(x+1, y+1, z) ≥ c
f(x,   y,   z+1) ≥ c
f(x+1, y,   z+1) ≥ c
f(x,   y+1, z+1) ≥ c
f(x+1, y+1, z+1) ≥ c
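
For concreteness, here is a minimal sketch of these two membership tests in C. The sample() helper is hypothetical; in the full code below, the cached grid_sample() plays that role:

#include <stddef.h>

/* Hypothetical sampler: returns the field value at a grid point.
 * (In the full code below, grid_sample() plays this role.) */
extern double sample(size_t x, size_t y, size_t z);

/* Does isosurface c pass through cell (x,y,z), 0 <= x,y,z <= N-2?
 * inner == 0: outer test, f(x,y,z) <= c and at least one other corner > c.
 * inner != 0: inner isovolume test, f(x,y,z) < c and at least one >= c. */
static int cell_on_isosurface(size_t x, size_t y, size_t z, double c, int inner)
{
    const double f000 = sample(x, y, z);
    double fmax = f000;
    size_t dx, dy, dz;

    /* Maximum over the seven other corners. (For a monotonically
     * non-decreasing f, testing (x+1,y+1,z+1) alone would suffice,
     * as it is never smaller than the other six.) */
    for (dz = 0; dz <= 1; dz++)
        for (dy = 0; dy <= 1; dy++)
            for (dx = 0; dx <= 1; dx++)
                if (dx + dy + dz > 0) {
                    const double s = sample(x + dx, y + dy, z + dz);
                    if (s > fmax)
                        fmax = s;
                }

    return inner ? (f000 <  c && fmax >= c)
                 : (f000 <= c && fmax >  c);
}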

Extension to any scalar function:

The approach shown here actually works for any f(x,y,z) that has only one maximum within the sampled region, say at (xMAX,yMAX,zMAX); and only one minimum, say at (xMIN,yMIN,zMIN); with no local maxima or minima within the sampled region.

In that case, the rule is that at least one of the eight corner samples f(x,y,z), f(x+1,y,z), f(x,y+1,z), f(x+1,y+1,z), f(x,y,z+1), f(x+1,y,z+1), f(x,y+1,z+1), f(x+1,y+1,z+1) must be below or equal to c, at least one above or equal to c, and not all equal to c.

Also, an initial cell an isosurface c passes through can then always be found using a binary search between (xMAX,yMAX,zMAX) and (xMIN,yMIN,zMIN), limiting the coordinates to 0 ≤ xMIN,yMIN,zMIN,xMAX,yMAX,zMAX ≤ N-2 (to only consider valid cells, in other words).

If the function is not monotonic, locating an initial cell the isosurface c passes through is more complicated. In that case, you need a different approach. (If you can find the grid coordinates for all local maxima and minima, then you can do binary searches from global minimum to local maxima above c, and from local minima below c to global maximum.)

Because we sample the function f(x,y,z) at intervals, we implicitly assume it to be continuous. If that is not true -- and you need to show the discontinuities as well -- you can augment the grid with discontinuity information at each point (seven boolean flags or bits per grid point, for "discontinuity from (x,y,z) to (x+,y+,z+)"). The surface walking then must also respect (not cross) such discontinuities.
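
As a sketch of what such augmentation could look like (the flag names here are mine, purely illustrative):

/* Hypothetical per-grid-point discontinuity flags: a set bit means
 * f is discontinuous between (x,y,z) and the indicated neighbor. */
#define DISC_X   (1U << 0)  /* towards (x+1, y,   z  ) */
#define DISC_Y   (1U << 1)  /* towards (x,   y+1, z  ) */
#define DISC_Z   (1U << 2)  /* towards (x,   y,   z+1) */
#define DISC_XY  (1U << 3)  /* towards (x+1, y+1, z  ) */
#define DISC_XZ  (1U << 4)  /* towards (x+1, y,   z+1) */
#define DISC_YZ  (1U << 5)  /* towards (x,   y+1, z+1) */
#define DISC_XYZ (1U << 6)  /* towards (x+1, y+1, z+1) */

The walking routine would then decline to extend the walk across any neighbor direction whose flag is set.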

In practice, I would use two arrays to describe the grid: one for cached samples, and one for two flags per grid point. One flag would describe that the cached value exists, and another that the walking routine has already walked the grid cell at that point. The structure I'd use/need for walking and constructing isosurfaces (for a monotonically non-decreasing function sampled in a regular grid) would be

typedef struct {
    size_t  xsize;
    size_t  ysize;
    size_t  zsize;
    size_t  size;  /* xsize * ysize * zsize */

    size_t  xstride; /* [z][y][x] array = 1 */
    size_t  ystride; /* [z][y][x] array = xsize */
    size_t  zstride; /* [z][y][x] array = xsize * ysize */

    double  xorigin; /* Function x for grid coordinate x = 0 */
    double  yorigin; /* Function y for grid coordinate y = 0 */
    double  zorigin; /* Function z for grid coordinate z = 0 */
    double  xunit;   /* Function x for grid coordinate x = 1 */
    double  yunit;   /* Function y for grid coordinate y = 1 */
    double  zunit;   /* Function z for grid coordinate z = 1 */

    /* Function to obtain a new sample */
    void   *data;
    double (*sample)(void *data, double x, double y, double z);

    /* Walking stack */
    size_t  stack_size;
    size_t  stack_used;
    size_t *stack;

    unsigned char *cell;  /* CELL_ flags */
    double        *cache; /* Cached samples */
} grid;

#define CELL_UNKNOWN (0U)
#define CELL_SAMPLED (1U)
#define CELL_STACKED (2U)
#define CELL_WALKED  (4U)

double grid_sample(const grid *const g, const size_t gx, const size_t gy, const size_t gz)
{
    const size_t i = gx * g->xstride + gy * g->ystride + gz * g->zstride;
    if (!(g->cell[i] & CELL_SAMPLED)) {
        g->cell[i] |= CELL_SAMPLED;
        g->cache[i] = g->sample(g->data, g->xorigin + (double)gx * g->xunit,
                                         g->yorigin + (double)gy * g->yunit,
                                         g->zorigin + (double)gz * g->zunit);
    }
    return g->cache[i];
}

and the function to find the cell to start the walk on, using a binary search along the grid diagonal (assuming a monotonically non-decreasing function, so all isosurfaces must cross the diagonal):

size_t grid_find(const grid *const g, const double c)
{
    const size_t none = g->size;
    size_t xmin = 0;
    size_t ymin = 0;
    size_t zmin = 0;
    size_t xmax = g->xsize - 2;
    size_t ymax = g->ysize - 2;
    size_t zmax = g->zsize - 2;
    double s;

    s = grid_sample(g, xmin, ymin, zmin);
    if (s > c) {
        return none;
    }
    if (s == c)
        return xmin*g->xstride + ymin*g->ystride + zmin*g->zstride;

    s = grid_sample(g, xmax, ymax, zmax);
    if (s < c)
        return none;
    if (s == c)
        return xmax*g->xstride + ymax*g->ystride + zmax*g->zstride;

    while (1) {
        const size_t x = xmin + (xmax - xmin) / 2;
        const size_t y = ymin + (ymax - ymin) / 2;
        const size_t z = zmin + (zmax - zmin) / 2;

        if (x == xmin && y == ymin && z == zmin)
            return x*g->xstride + y*g->ystride + z*g->zstride;

        s = grid_sample(g, x, y, z);
        if (s < c) {
            xmin = x;
            ymin = y;
            zmin = z;
        } else
        if (s > c) {
            xmax = x;
            ymax = y;
            zmax = z;
        } else
            return x*g->xstride + y*g->ystride + z*g->zstride;
    }
}

#define GRID_X(grid, index) (((index) / (grid)->xstride) % (grid)->xsize)
#define GRID_Y(grid, index) (((index) / (grid)->ystride) % (grid)->ysize)
#define GRID_Z(grid, index) (((index) / (grid)->zstride) % (grid)->zsize)

The three macros above show how to convert the grid index back to grid coordinates.

To walk the isosurface, we cannot rely on recursion; the call chains would be too long. Instead, we have a walk stack for cell indexes we should examine:

static void grid_push(grid *const g, const size_t cell_index)
{
    /* If the stack is full, remove cells already walked. */
    if (g->stack_used >= g->stack_size) {
        const size_t         n = g->stack_used;
        size_t *const        s = g->stack;
        unsigned char *const c = g->cell;
        size_t               i = 0;
        size_t               o = 0;

        while (i < n)
            if (c[s[i]] & CELL_WALKED)
                i++;
            else
                s[o++] = s[i++];

        g->stack_used = o;
    }

    /* Grow stack if still necessary. */
    if (g->stack_used >= g->stack_size) {
        size_t *new_stack;
        size_t  new_size;

        if (g->stack_used < 1024)
            new_size = 1024;
        else
        if (g->stack_used < 1048576)
            new_size = g->stack_used * 2;
        else
            new_size = (g->stack_used | 1048575) + 1048448;

        new_stack = realloc(g->stack, new_size * sizeof g->stack[0]);
        if (new_stack == NULL) {
            /* FATAL ERROR, out of memory */
        }

        g->stack = new_stack;
        g->stack_size = new_size;
    }

    /* Unnecessary check.. */
    if (!(g->cell[cell_index] & (CELL_STACKED | CELL_WALKED)))
        g->stack[g->stack_used++] = cell_index;
}

static size_t grid_pop(grid *const g)
{
    while (g->stack_used > 0 &&
           g->cell[g->stack[g->stack_used - 1]] & CELL_WALKED)
        g->stack_used--;
    if (g->stack_used > 0)
        return g->stack[--g->stack_used];
    return g->size; /* "none" */
}

The function that verifies that the isosurface passes through the current cell, reports those to a callback function, and walks the isosurface, would be something like

int isosurface(grid *const g, const double c,
               int (*report)(grid *const g,
                             const size_t x, const size_t y, const size_t z,
                             const double c,
                             const double x0y0z0,
                             const double x1y0z0,
                             const double x0y1z0,
                             const double x1y1z0,
                             const double x0y0z1,
                             const double x1y0z1,
                             const double x0y1z1,
                             const double x1y1z1))
{
    const size_t xend = g->xsize - 2; /* Since we examine x+1, too */
    const size_t yend = g->ysize - 2; /* Since we examine y+1, too */
    const size_t zend = g->zsize - 2; /* Since we examine z+1, too */
    const size_t xstride = g->xstride;
    const size_t ystride = g->ystride;
    const size_t zstride = g->zstride;
    unsigned char *const cell = g->cell;
    double x0y0z0, x1y0z0, x0y1z0, x1y1z0,
           x0y0z1, x1y0z1, x0y1z1, x1y1z1; /* Cell corner samples */
    size_t x, y, z, i;
    int r;

    /* Clear walk stack. */
    g->stack_used = 0;

    /* Clear walked and stacked flags from the grid cell map. */
    i = g->size;
    while (i-->0)
        g->cell[i] &= ~(CELL_WALKED | CELL_STACKED);

    i = grid_find(g, c);
    if (i >= g->size)
        return errno = ENOENT; /* No isosurface c */

    x = (i / g->xstride) % g->xsize;
    y = (i / g->ystride) % g->ysize;
    z = (i / g->zstride) % g->zsize;

    /* Limit x,y,z to the valid *cell* coordinate range. grid_find()
     * works on grid coordinates and may return a far-edge grid point,
     * so clamp to the nearest valid cell. */
    if (x > xend) x = xend;
    if (y > yend) y = yend;
    if (z > zend) z = zend;
    i = x*g->xstride + y*g->ystride + z*g->zstride;

    grid_push(g, i);

    while ((i = grid_pop(g)) < g->size) {
        x = (i / g->xstride) % g->xsize;
        y = (i / g->ystride) % g->ysize;
        z = (i / g->zstride) % g->zsize;

        cell[i] |= CELL_WALKED;

        x0y0z0 = grid_sample(g,   x,   y,   z);
        if (x0y0z0 > c)
            continue;

        x1y0z0 = grid_sample(g, 1+x,   y,   z);
        x0y1z0 = grid_sample(g,   x, 1+y,   z);
        x1y1z0 = grid_sample(g, 1+x, 1+y,   z);
        x0y0z1 = grid_sample(g,   x,   y, 1+z);
        x1y0z1 = grid_sample(g, 1+x,   y, 1+z);
        x0y1z1 = grid_sample(g,   x, 1+y, 1+z);
        x1y1z1 = grid_sample(g, 1+x, 1+y, 1+z);

        /* Isosurface does not pass through this cell?!
         * (Note: I think this check is unnecessary.) */
        if (x1y0z0 < c && x0y1z0 < c && x1y1z0 < c &&
            x0y0z1 < c && x1y0z1 < c && x0y1z1 < c &&
            x1y1z1 < c)
            continue;

        /* Report the cell. */
        if (report) {
            r = report(g, x, y, z, c, x0y0z0, x1y0z0,
                       x0y1z0, x1y1z0, x0y0z1, x1y0z1,
                       x0y1z1, x1y1z1);
            if (r) {
                errno = 0;
                return r;
            }
        }

        /* Could the surface extend to -x? */
        if (x > 0 &&
            !(cell[i - xstride] & (CELL_WALKED | CELL_STACKED)) &&
            ( x0y1z0 >= c || x0y0z1 >= c ))
            grid_push(g, i - xstride);

        /* Could the surface extend to -y? */
        if (y > 0 &&
            !(cell[i - ystride] & (CELL_WALKED | CELL_STACKED)) &&
            ( x0y0z1 >= c || x1y0z0 >= c ))
            grid_push(g, i - ystride);

        /* Could the surface extend to -z? */
        if (z > 0 &&
            !(cell[i - zstride] & (CELL_WALKED | CELL_STACKED)) &&
            ( x1y0z0 >= c || x0y1z0 >= c ))
            grid_push(g, i - zstride);

        /* Could the surface extend to +x? */
        if (x < xend &&
            !(cell[i + xstride] & (CELL_WALKED | CELL_STACKED)) &&
            ( x0y1z0 >= c || x0y0z1 >= c ))
            grid_push(g, i + xstride);

        /* Could the surface extend to +y? */
        if (y < yend &&
            !(cell[i + ystride] & (CELL_WALKED | CELL_STACKED)) &&
            ( x1y0z0 >= c || x0y0z1 >= c ))
            grid_push(g, i + ystride);

        /* Could the surface extend to +z? */
        if (z < zend &&
            !(cell[i + zstride] & (CELL_WALKED | CELL_STACKED)) &&
            ( x1y0z0 >= c || x0y1z0 >= c ))
            grid_push(g, i + zstride);
    }

    /* All done. */
    errno = 0;
    return 0;
}

In this particular case, I do believe the isosurfaces are best visualized/described using a polygon mesh, with samples within a cell linearly interpolated. Then, each report() call produces one polygon (or one or more flat triangles).
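
For example, a minimal report() callback (purely illustrative, assuming <stdio.h> is included) could simply print each cell and two of its corner samples, producing a crude cell listing before any polygon construction:

/* Example callback: print the cell coordinates and the samples at the
 * near and far corners. Returning nonzero aborts the isosurface() walk. */
static int print_cell(grid *const g,
                      const size_t x, const size_t y, const size_t z,
                      const double c,
                      const double x0y0z0, const double x1y0z0,
                      const double x0y1z0, const double x1y1z0,
                      const double x0y0z1, const double x1y0z1,
                      const double x0y1z1, const double x1y1z1)
{
    printf("cell (%zu, %zu, %zu): samples %g .. %g at the near and far corners, isosurface %g\n",
           x, y, z, x0y0z0, x1y1z1, c);
    return 0; /* continue walking */
}

which you would invoke as isosurface(&g, c, print_cell).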

Note that the cell has 12 edges, and the isosurface must cross at least three of them. Let's assume we have two corner samples c0 and c1, joined by an edge, with the two corners at coordinates p0=(x0,y0,z0) and p1=(x1,y1,z1) respectively:

if (c0 == c && c1 == c)
   /* Entire edge is on the isosurface */
else
if (c0 == c)
   /* Isosurface intersects edge at p0 */
else
if (c1 == c)
   /* Isosurface intersects edge at p1 */
else
if (c0 < c && c1 > c)
   /* Isosurface intersects edge at p0 + (p1-p0)*(c-c0)/(c1-c0) */
else
if (c0 > c && c1 < c)
   /* Isosurface intersects edge at p1 + (p0-p1)*(c-c1)/(c0-c1) */
else
   /* Isosurface does not intersect the edge */

The above check is valid for any kind of continuous function f(x,y,z); for non-monotonic functions the problem is just finding the relevant cells. The isosurface() function needs some changes (the checks wrt. x0y0z0..x1y1z1), according to the rules outlined earlier in this post, but it too can be made to work for any continuous function f(x,y,z) with little effort.

Constructing the polygon/triangle(s) when the samples at the cell corners are known, especially using linear interpolation, is very simple as you can see.
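
As a sketch (using the corner samples c0, c1 and coordinates p0, p1 from the edge check above), computing one interpolated vertex could look like this:

typedef struct { double x, y, z; } point;

/* Linearly interpolated point where isosurface c crosses the edge
 * from p0 (sample c0) to p1 (sample c1). The caller must first check
 * that the edge is actually crossed, and that c0 != c1. */
static point edge_intersection(point p0, double c0, point p1, double c1, double c)
{
    const double t = (c - c0) / (c1 - c0); /* 0 at p0, 1 at p1 */
    point p;
    p.x = p0.x + t * (p1.x - p0.x);
    p.y = p0.y + t * (p1.y - p0.y);
    p.z = p0.z + t * (p1.z - p0.z);
    return p;
}

Collecting one such point per crossed edge gives the 3 to 12 vertices of the polygon spanning the cell.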

Note that there is usually no reason to worry about the order in which the edges of a cell are checked, as you will almost certainly use vector calculus, and the cross product in particular, to orient the points and polygons. Or, if you like, you can do a Delaunay triangulation on the points (3 to 12 for any function, although more than 6 points indicates there are two separate surfaces, I believe) to construct flat polygons.

Questions? Comments?



We have a scalar field f(x,y,z) in three dimensions. The field is costly to sample/evaluate, and we do so only at integer coordinates 0 ≤ x,y,z ∈ ℕ. To visualize the scalar field, we wish to locate one or more isosurfaces (surfaces with a specific f(x,y,z) value), using the minimum number of samples/evaluations.

The approach I'll try to describe here is a variant of the algorithm used in Fractint, to minimize the number of iterations needed to draw certain fractals. Some fractals have large areas with the same "value", so instead of sampling every point within the area, a certain drawing mode traced the edges of those areas.

In other words, instead of locating individual points of the isosurface c, f(x,y,z) = c, you can locate just one point, and then walk the isosurface. The walk part is a bit complicated to visualize, but it really is just a 3D variant of the flood fill algorithm used in simple computer graphics. (In fact, given the field is monotonically non-decreasing along each dimension, it'll be a mostly 2D walk, with typically just a few grid points other than those relevant to the isosurface c sampled. This should be really efficient.)

I'm pretty sure there are good peer-reviewed papers describing this very technique (probably in more than one problem domain), but since I'm too lazy to do a better search than a couple of minutes of Google searches, I leave it to others to find good references. Apologies.


For simplicity, for now, let's assume that the field is continuous and monotonically increasing along each dimension. Within an axis-oriented box of size N×N×N, the field will have a minimum at one corner at origin (0,0,0), and a maximum at the far corner from origin, at (N,N,N), with all possible values between the minimum and maximum found along the diagonal from (0,0,0) to (N,N,N). In other words, every possible isosurface exists and is a continuous 2D surface (excluding points (0,0,0) and (N,N,N)), and every such surface intersects the diagonal.

If the field is actually non-continuous, we won't be able to tell, because of our sampling method. In practice, our sampling means we implicitly assume the scalar field is continuous; we will treat it as continuous, whether or not it really is!

If the function is actually monotonically increasing along each dimension, then it is possible to map f(x,y,z)=c to X(y,z)=x, Y(x,z)=y, Z(x,y)=z, although any one of the three is sufficient to define the isosurface c. This is because the isosurface can only cross any line spanning the box in at most one point.

If the function is monotonically non-decreasing instead, the isosurface can intersect any line spanning the box still only once, but the intersection can be wider (than a point) along the line. In practice, you can handle this by considering only the lower or upper surfaces of the isovolumes (volumes with a static field); i.e. only the transition from-lower-than-c-to-c-or-greater, or the transition from-c-or-lower-to-greater-than-c. In all cases, you're not really looking for the isosurface value c, but trying to locate where a pair of the field samples crosses c.
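
To illustrate that crossing search (my own sketch, reusing the hypothetical sample() helper from the sketch earlier in this answer): because the isosurface crosses each grid line at most once, a binary search along, say, the x direction at fixed (y,z) finds the crossing cell in O(log N) samples:

/* Largest cell x (0 <= x <= xsize-2) with f(x,y,z) <= c at fixed y,z,
 * assuming f is non-decreasing in x; returns xsize if even f(0,y,z) > c.
 * The caller should verify f(x+1,y,z) > c to confirm a real crossing. */
static size_t line_crossing_x(size_t xsize, size_t y, size_t z, double c)
{
    size_t lo = 0, hi = xsize - 2;

    if (sample(0, y, z) > c)
        return xsize; /* the whole line is above c */

    while (lo < hi) {
        const size_t mid = lo + (hi - lo + 1) / 2;
        if (sample(mid, y, z) <= c)
            lo = mid;      /* crossing is at mid or later */
        else
            hi = mid - 1;  /* crossing is before mid */
    }
    return lo;
}

This is essentially the X(y,z) mapping mentioned above, restricted to the grid.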

Because we sample the field at regular grid points, and the isosurface rarely (if ever) intersects those grid points exactly, we divide the original box into N×N×N unit-sized cubes, and try to find the cubes the desired isosurface intersects.

Here is a simple illustration of one such cube, at (x,y,z) to (x+1,y+1,z+1): (figure: example unit cube)

When the isosurface intersects a cube, it intersects at least one of the edges marked X, Y, or Z, and/or the diagonal marked D. In particular, we'll have f(x,y,z) ≤ c, and one or more of:

  • f(x+1,y,z) > c (isosurface c crosses the cube edge marked with X) (Note: In this case, we wish to walk along the y and z dimensions)
  • f(x,y+1,z) > c (isosurface c crosses the cube edge marked with Y) (Note: In this case, we wish to walk along the x and z dimensions)
  • f(x,y,z+1) > c (isosurface c crosses the cube edge marked with Z) (Note: In this case, we wish to walk along the x and y dimensions)
  • f(x+1,y+1,z+1) > c (isosurface c crosses the cube diagonal, marked with D) (Note: In this case, we may need to examine all directly connected grid points, to see which direction we need to walk to.)

Instead of doing a complete search of the original volume, we can just find one such cube, and walk along the cubes to discover the cubes the isosurface intersects.

Since all isosurfaces have to intersect the diagonal from (0,0,0) to (N,N,N), we can find such a cube using just 2+ceil(log2(N)) samples, using a binary search over the cubes on the diagonal. The target cube (i,i,i) is the one for which f(i,i,i) ≤ c and f(i+1,i+1,i+1) > c. (For monotonically non-decreasing fields with isovolumes, this finds the isovolume surface closer to origin as the isosurface.)

When we know that the isosurface c intersects a cube, we can use basically three approaches to convert that knowledge to a point (that we consider the isosurface to intersect):

  1. The cube has eight corners, each at a grid point. We can pick the corner/grid point with the field value closest to c.
  2. We can interpolate -- choose an approximate point -- where the isosurface c intersects the edge/diagonal. We can do linear interpolation without any extra samples, since we already know the samples at the ends of the crossed edge/diagonal. If u = f(x,y,z) < c, and v > c is the sample at the other end, the linearly interpolated intersection point along that line occurs at (c-u)/(v-u), with 0 being at (x,y,z), and 1 being at the other end of the edge/diagonal (at (x+1,y,z), (x,y+1,z), (x,y,z+1), or (x+1,y+1,z+1)).
  3. You can use a binary search along the edge/diagonal to find the intersection point (see the sketch just below this list). This needs n extra samples per edge/diagonal, to get the intersection point at n-bit accuracy along the edge/diagonal. (As the original grid cannot be too coarse compared to the details in the field, or the details will not be visible anyway, you normally use something like n=2, n=3, n=4, or n=5 at most.)
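
Here is a minimal sketch of option 3, using the point type from the earlier sketch. The field() function is a placeholder of mine for one expensive evaluation at real-valued coordinates:

/* Placeholder for one expensive field evaluation at real coordinates. */
extern double field(double x, double y, double z);

/* Refine where f crosses c along the edge from p0 to p1, given that
 * f(p0) <= c <= f(p1). Each of the n halving steps costs one extra
 * evaluation, and yields one more bit of accuracy along the edge. */
static double edge_refine(point p0, point p1, double c, unsigned n)
{
    double lo = 0.0, hi = 1.0; /* parameter along the edge */

    while (n-- > 0) {
        const double t = 0.5 * (lo + hi);
        const double s = field(p0.x + t * (p1.x - p0.x),
                               p0.y + t * (p1.y - p0.y),
                               p0.z + t * (p1.z - p0.z));
        if (s <= c)
            lo = t;  /* crossing is in the upper half */
        else
            hi = t;  /* crossing is in the lower half */
    }
    return 0.5 * (lo + hi); /* 0 = at p0, 1 = at p1 */
}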

The intersection points for the isosurface c thus obtained can be used for fitting some surface function, but I have not seen that in real life. Typically, Delaunay triangulation is used to convert the point set to a polygon mesh, which is then easy to visualize.

Another option is to remember which cube (x,y,z) and edge/diagonal (X, Y, or Z edge, or D for the diagonal) each point is related to. Then, you can form a polygon mesh directly. Voxel techniques can also be used to quickly draw partially transparent isosurfaces; each view ray examines each cube once, and if the isosurface is present, the isosurface intersection points can be used to interpolate a surface normal vector, producing very smooth and accurate-looking isosurfaces with raycasting/raytracing methods (without creating any polygon mesh).


It seems to me this answer is in need of editing -- at minimum, some sleep, further thought, and clarifications. Questions, suggestions, and even edits are welcome!

If there is interest from more than just the OP, I could try and see if I can cobble together a simple example C program for this. I've toyed with visualizing simulated electronic structures, and those fields are not even monotonic (although sampling is cheap).

Nominal Animal
  • Thanks for the effort you have put in writing this answer. I am working along the lines you suggested. Meanwhile as you said it would be extremely helpful if you can consolidate the points with a pseudo-code if not a C program. – CRM Dec 10 '14 at 12:21
  • A quick note: the way the cost function is calculated might allow for an interpolation-search-based approach rather than a binary search, allowing for something like O(log log N) samples instead (though it's not clear if that's practical). Also, nice diagram. – Nuclearman Dec 11 '14 at 19:23
  • @Nuclearman: Sure, but a binary search here takes 2+log2(max(xsize,ysize,zsize)). Even for a 1024×1024×1024 grid, we only need 2+log2(1024) = 12 samples at most; 18 for 65536×65536×65536. And two of those are always at min and max. Anyway, if we can guess an approximate form for the field function, we could sample it to fit the coefficients, then use random sampling to compare the fit to the actual function, to find out how reliable/precise the fit is. – Nominal Animal Dec 12 '14 at 09:28
  • Fair enough, though log log N would halve the samples or so. If the sampling process is as expensive as claimed, it might be worth it. Anyway, you have a rather impressively long answer now. – Nuclearman Dec 12 '14 at 10:41
  • @Nuclearman: The number of samples relevant to any reasonable isosurface is at least an order of magnitude larger, I'd say tens at minimum. That's why the initial binary search is not that relevant. A "perfect" walk only looks at the samples immediately surrounding the surface. I would have preferred a much shorter answer, but not locating any descriptions of the method, and not having implemented this walking/tracing variant myself, the description is .. lacking; vague, and overly verbose. Any ideas on how to make it shorter? :) – Nominal Animal Dec 12 '14 at 14:07
  • Fair point, I suppose I was more thinking of its benefit to my own approach (which I outlined somewhat in the comments area of my answer, but didn't add in the answer), where a diagonal search is used, then upwards of 8 additional searches are used along the edges. Overall, the description does seem as you say, and is likely excessive on the field theory, but I can't point to anything specific that can be easily removed without detracting from the answer. – Nuclearman Dec 12 '14 at 20:31
  • @NominalAnimal I have used your algorithm and am checking it with some simple non-monotonic functions. It was working well until I hit one issue. When the isosurface() method calls grid_sample() to get the cost at all corners of the cube, there is no bounds check in grid_sample() as of now, leading to an out-of-bounds error on some inputs. I tried to fix it but couldn't yet. I don't know exactly what we should return from grid_sample() when it is out of bounds. How to handle it when x or y or z reaches the boundary? – CRM Dec 14 '14 at 13:24
  • @NominalAnimal the java code(same code you published but in java) I have used is here http://ideone.com/FSn4of In this code, if we change cost variable value in main to cost = 15000, out of bounds error occurs for example. – CRM Dec 14 '14 at 13:32
  • @NominalAnimal Other than the out-of-bounds problem, another problem I am facing is when we search for the starting point (x,y,z) by binary search along the diagonal: if we don't get an exact match, then isosurface() exits after just one iteration. How to handle that too? – CRM Dec 14 '14 at 17:21
  • @Rajmohan: The best option would be to set `xmax=g.xsize-2`, `ymax=g.ysize-2`, and `zmax=g.zsize-2` in `grid_find()`, and add a corresponding `if (s == c) return xmax*g.xstride+ymax*g.ystride+zmax*g.zstride;` check. (It might mean isosurfaces that pass only though the outermost level of cells might not be found.) In `isosurface()`, we should only walk cells `0 <= x <= g.xsize-2`, `0 <= y <= g.ysize-2`, `0 <= z <=g.zsize-2`, since we examine the `x+1`, `y+1`, and `z+1` cells. My code above had that bug too; fixed now. – Nominal Animal Dec 15 '14 at 11:29
  • @Rajmohan: Your `grid_push()` does not copy the old stack contents to the new stack. You should copy the `g.stack_used` entries from the old to the new, since you're not *reallocating*, but allocating a completely new one. (C `realloc()` retains the contents, even if the pointer changes.) For `isosurface()`, compare to the changes I made to mine wrt. `xmax`, `ymax`, `zmax`; note the new checks, and changed comparisons (`<=` to `<` and so on). – Nominal Animal Dec 15 '14 at 11:43
  • @NominalAnimal If cost C is not found by the initial binary search along the diagonal in the grid_find() method, how do I proceed? As of now, the algorithm stops having found zero points. – CRM Dec 17 '14 at 20:22
  • @Rajmohan: `grid_find()` does not need to find exact `c`, just a cell that *spans* c, i.e. `cost(x,y,z) ≤ c && cost(x+1,y+1,z+1) > c`. As long as `cost(xmin,ymin,zmin) < c` and `cost(xmax,ymax,zmax) > c` where `0 ≤ xmin,xmax ≤ xsize-2`, `0 ≤ ymin,ymax ≤ ysize-2`, and `0 ≤ zmin,zmax ≤ zsize-2`, a binary search should find such a cell. Could you verify that that is the case? (Say, add debug prints if the extreme points do satisfy that condition?) You could also sample the inner corners of the grid corner cells to find the minimum and maximum samples. – Nominal Animal Dec 18 '14 at 06:12
  • @NominalAnimal In grid_find(), inside the while loop, when if (x == xmin && y == ymin && z == zmin) gets satisfied and grid_find() returns, this problem occurs. At those times, in the isosurface() method, after the seed location returned by grid_find(), there are no points pushed, since none of the if conditions in the while loop pass, leading to exit of the while loop in just one iteration in isosurface(). – CRM Dec 19 '14 at 02:43
  • @Rajmohan: Yes. That is why we need to distinguish between *grid coordinates* and *cell coordinates*. We have one sample per grid node, but eight (2³) samples defining each cell. Please see my other code to see how I fix that. – Nominal Animal Dec 30 '14 at 14:35
2

You should look into this article which talks about the 2-dimensional case and gives you a great insight into the different methodologies: http://leetcode.com/2010/10/searching-2d-sorted-matrix.html

In my opinion, the step-wise linear search (in part II there) would be a great first step for you because it's very easy to apply to the 3-d case and it really doesn't require a lot of experience to understand.
Because this is so straightforward and still very efficient, I would go with this and see if it fits your needs for the kind of data you're working with in 3-d.
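
For reference, here is a minimal 2D sketch of that step-wise linear search (also called saddleback search), assuming values strictly increasing along each row and each column (plateaus need a little extra care). In the OP's 3D setting, each matrix access would be one expensive cost query:

#include <stddef.h>

/* Report every (row, col) with m[row][col] == c, in a row-major
 * rows x cols matrix whose values increase along each row and each
 * column. Starts at the top-right corner; at most rows+cols steps. */
static void saddleback(size_t rows, size_t cols, const double *m,
                       double c, void (*report)(size_t row, size_t col))
{
    size_t row = 0;
    size_t col = cols; /* offset by +1 to keep the index unsigned */

    while (row < rows && col > 0) {
        const double v = m[row * cols + (col - 1)];
        if (v > c)
            col--;  /* entries below are >= v > c: discard the column */
        else if (v < c)
            row++;  /* entries to the left are <= v < c: discard the row */
        else {
            report(row, col - 1);
            row++;  /* strictly increasing: no other c in this row... */
            col--;  /* ...or in this column */
        }
    }
}

Extending the same idea to 3D gives the O(n+m+o) behavior discussed in the comments below.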

However, if your only goal is performance, then you should apply the binary partition to 3-d. This gets a little bit more complex because the 'binary partition' he talks about essentially becomes a 'binary plane partition'.
So you don't have a line partitioning your matrix into 2 possible smaller matrices.
Instead you have a plane partitioning your cube into 2 possible smaller cubes.
To make the search in that plane (or matrix) efficient, you would first have to implement one of his methods :).
Then you repeat everything with the smaller cubes.
Keep in mind that implementing this in a very efficient way (i.e. keeping memory access in mind) is not trivial.

Matt Ko
  • I understand your suggestion. I will try that. Meanwhile, is there any references you aware of for 3-d? – CRM Dec 06 '14 at 14:18
  • I didn't really find anything specifically for 3D. As soon as you go into higher dimensions than 2, there are usually generic solutions for any number of dimensions. So it's really difficult to find an explanation for the special case of 3 dimensions. You can get into k-d trees as suggested in the comments already, but that's more of a data structure than an algorithm... – Matt Ko Dec 06 '14 at 16:07
  • That's a really long article, and in the end it misses the simplest and best algorithm. Saddleback by Dijkstra runs in best and worst case `O(n)`. – btilly Dec 06 '14 at 20:54
  • @btilly: Saddleback is exactly what the article describes as step-wise linear search, it just doesn't call it out that way. For a n * m * o matrix, it runs in `O(n+m+o)`. This is why I suggested it in the first place. However, as you certainly know, best and worst case `O(n)` does not always make an algorithm better than a worst case `O(n lg n)` for example - see differences between RadixSort, MergeSort and QuickSort, out of those three, for some reason, the one with the slowest worst case seems to be used in almost all implementations. – Matt Ko Dec 07 '14 at 08:55
  • 1
    @MattKo That's because most of the time average running time matters more. But in the case where you have to find *every* solution, then no solution has a best case better than `O(n)` and so saddleback can't be beat for efficiency, and probably not on the constants either. That is why I think that the OP is best off starting with that. As for sorting, many real world implementations these days have switched to http://en.wikipedia.org/wiki/Timsort for better tradeoffs on common use cases. – btilly Dec 07 '14 at 17:28
2

I'll give this answer in an effort to minimize the number of costs calculated. Matt Ko links to a good solution, but it assumes a cheap cost function and matrix-based data, neither of which you seem to have. The approach I give requires much closer to O(log N + k) calls to the cost function, where k is the number of points with the desired cost. Note that with some performance optimizations, this algorithm could be made O(N) on a 3D matrix with little change in the number of cost function calls, though it's a fair bit more complicated.

The pseudocode, which is based on techniques used in quickselect, looks like this:

While there are still points under consideration:
    Find the ideal pivot point and calculate its cost
    Remove the pivot from the point set
    If the cost is the desired cost then:
        Add the pivot to the solution set
    Else:
        Separate the points into 3 groups:
             G1. Those that are in the pivot's octant `VII`
             G2. Those that have the same x, y, or z as the pivot
             G3. Those that are not in the pivot's octant `VII`
             # Note this can be done in O(N)

        If all points are in group 2:
            Use 1D binary searches in each dimension to find points with the desired cost
        Else:
            Keep all points in group 2
            If the pivot cost is greater than desired:
                Also keep the points in group 1
            Else:
                Also keep the points in group 3

The pivot is selected based on the points inside and outside of its octant `VII`. Points on any of the 3 planes that form the octants are dealt with later if needed (G2), as sketched below.
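
A sketch of that classification (the names and the octant orientation are my assumptions; here octant `VII` means all coordinates strictly less than the pivot's, matching the monotonically non-decreasing cost):

typedef struct { int x, y, z; } pt;

/* Classify a point relative to the pivot for the partition above:
 * 2 = shares an x, y, or z coordinate plane with the pivot (G2),
 * 1 = all coordinates strictly less than the pivot's (G1, octant VII),
 * 3 = everything else (G3). */
static int classify(pt p, pt pivot)
{
    if (p.x == pivot.x || p.y == pivot.y || p.z == pivot.z)
        return 2;
    if (p.x < pivot.x && p.y < pivot.y && p.z < pivot.z)
        return 1;
    return 3;
}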

The ideal pivot point is the one for which the numbers of points in group 1 (G1) and group 3 (G3) are as close to equal as possible. Mathematically, that means minimizing the ratio of the larger of the two to the smaller, i.e. minimize( max(|G1|,|G3|) / min(|G1|,|G3|) ). Even a fairly naive algorithm looking for the ideal pivot point can find it in O(N^2) (an O(N log N) algorithm likely exists), but it takes O(N^3) to compute the cost of the ideal pivot after it's found.

After the ideal pivot is found and its cost computed, each iteration should see on average roughly half the remaining points discarded, which, again, results in only O(log N + k) calls to the cost function.

Final Note:

In retrospect, I'm not sure special consideration for group 2 is actually required, as it's probably covered by group 3, but I'm not 100% sure. However, separating it out doesn't seem to change the Big O, so I didn't see a need to change it, though doing so would simplify the algorithm slightly.

Nuclearman
  • This is a great answer! What I don't understand yet is why you are saying that it takes `O(N^3)` to compute the cost? I don't see that anywhere. – Matt Ko Dec 06 '14 at 18:35
  • That could be an issue, as I see now you only say "The resolution of 3d object would be like: n * n * n", which I thought was the cost. This actually makes it sound more like a matrix. Seems you didn't actually answer me when I asked how expensive, performance-wise, the cost function actually is. Without knowing the performance of the cost function I can't say if this approach actually helps reduce the overall cost or not. The performance of this algorithm is thus `O(P + log N + k * C)`, where `P` is the pivot finding cost (`O(N^2)` or better) and `C` is the cost function performance (unknown). – Nuclearman Dec 06 '14 at 18:56
  • I see what you mean now, and Matt Ko's solution seems more accurate now. However, in this case `N=1000^3` is more accurate as far as Big O goes. Although, if you are using a 3D matrix, then there are a number of nice optimizations you can do. In theory it should be possible to purely mathematically compute the ideal pivot, allowing for `P=O(1)` or at worst `P=O(log N)`, as well as using simple math and pointers to do the group separation. Which is somewhat better than the `O(cuberoot(N))` you'd probably get from Matt Ko's answer. – Nuclearman Dec 06 '14 at 19:07
1

This is not an answer per se, just slightly generalized example C code. (The code was too long to include verbatim.)

The basic implementation is in grid.h (pastebin link).

In it, I've tried to make a distinction between grid coordinates (0 ≤ x,y,z ≤ size-1) and cell coordinates (0 ≤ x,y,z ≤ size-2). In particular, note the span type. Each cell spans a range of values: either interpolated, or the discrete set of the samples at the eight corners of the cell. Because this example uses linear interpolation to determine where within each cell the isosurface intersects the edges or a diagonal, I assume continuous spans.

I didn't realize how important the value spans of cells are for the edge cases before I implemented this example code. That is why the OP and I discussed the edge cases in the comments to my other answer, and why the logic outlined in my other answer alone does not handle the edge cases correctly.

Since OP's particular case is not that common/interesting, this example is much more generic (and therefore quite unoptimized for the OP's case). In fact, this example only requires that the function has no local minima or maxima (saddle points and constant regions are allowed); just one minimum and one maximum within the gridded region. Minimum and maximum do not need to be point-like; they can be continuous regions.

As such, at grid generation time, we do not know which cells contain the minimum and maximum. (In OP's case, the scalar field is monotonically non-decreasing and limited to the positive octant, so the minimum is at 0,0,0 and maximum at size-1,size-1,size-1.)

To find the minimum and maximum, I implemented two functions, that start from the best corner in the grid (having the smallest or greatest sample value). grid_maximum_cell() walks non-decreasing cells, and grid_minimum_cell() walks non-increasing cells. Since the scalar field is sampled, we implicitly assume it is continuous. As long as there are no local maxima or minima where the walk might stop, the walk will reach the correct cell in relatively few samples. (This search could be optimized much further, though. Consider these two functions just starting points for your own implementation. The OP does not need these at all, of course.)

(Actually, the requirement for the sampled scalar field is that each isosurface is continuous, and that all isosurfaces intersect the line drawn between the minimum and maximum cells found using the above two functions.)

The function grid_isosurface() can be used to locate the cells the desired isosurface (field value) passes through. The last parameter is a function pointer. That function is called once for each cell the isosurface passes through. (Note the indexing order for the corner samples, [x][y][z].)

grid_isosurface() locates an initial cell the desired isosurface passes through using a binary search (on the line from the cell containing the minimum sample, to the cell containing the maximum sample). It then traces the surface, using the flood-fill-like algorithm outlined in my other answer.

For an example, grid.c (pastebin link) uses the above include file, to evaluate the scalar field

f(x, y, z) = x³ + y³ + z³ + x + y - 0.125·(x·y + x·z + y·z + x·y·z).

On my Linux machine, I compiled and ran the example using

gcc -Wall -std=c99 -Wno-unused -O2 grid.c -o isosurface
./isosurface 50 -1.0 1.0 0.0 > out-0.0
./isosurface 50 -1.0 1.0 0.5 > out-0.5
./isosurface 50 -1.0 1.0 1.0 > out-1.0

and used Gnuplot to plot out the three isosurfaces:

splot "out-0.0" u 1:2:3 notitle w dots, "out-0.5" u 1:2:3 notitle w dots, "out-1.0" u notitle w dots

which leads to this pretty nice point cloud (rotatable in Gnuplot): (figure: three isosurfaces as point clouds)

When the grid is initially generated, 14 samples are taken to locate the maximum and minimum cells. Tracing the isosurfaces required an additional 18024, 18199, and 16953 samples, respectively; note that far fewer samples are needed for the second and subsequent isosurfaces if you trace them consecutively on the same grid.

The total grid above contains 51×51×51 = 132651 samples, so tracing one isosurface required about 13% of the grid points to be sampled. For a 101×101×101 grid, the samples needed drop to about 7%; for a 201×201×201 grid, down to 3.5%; for a 501×501×501 grid, to 1.4% (1.7M out of 125.75M samples).

None of this code is optimized for OP's case, nor optimized in general. A sample cache is used to minimize the number of samples needed in general, but the grid_isosurface() isosurface walking function, and the initial grid_minimum_cell() and grid_maximum_cell() functions can be modified to require slightly fewer samples. For larger grids, I don't expect the optimizations to make much of a difference, but for very small grids and very slow functions to evaluate, it might be worthwhile.

If the intent is to generate a polygon mesh for each isosurface, I recommend generating each polygon in the callback function, not from the overall generated point cloud. Using the edge/diagonal intersections like in the above example program, you get all the vertices for the polygon spanning that cell (no caches or such are needed). All you need is to order the edge intersection points correctly.

Questions? Comments? Bug fixes? Suggestions?

Nominal Animal