
I have a 3D numpy array which represents a mask. The "1" elements of this mask are areas where no calculation is done (blue areas in the figure below). The "0" elements of the mask are areas for which a calculation is made.

Each calculation must be performed on a cuboid block whose edges are all at least Nmin cells long. So I am trying to find the best distribution of cuboids to fill the "0" regions of this mask.

Here is an example mask: mask.npy

And its 3D representation:

import numpy as np
import matplotlib.pyplot as plt

mask = np.load('mask.npy')

fig = plt.figure()
ax1 = fig.add_subplot(projection='3d')
ax1.voxels(mask)
plt.show()

[figure: 3D voxel plot of the mask]

It is a packing problem. The goal is to pack various cuboids into a larger cuboid with some constraints (the blue areas). There are a lot of possible combinations.

The way I see to solve this problem is to first find the largest "0" region, place a cuboid there, and repeat this process until the mask is completely filled.

I already did this kind of thing in 2D (see fdgrid), but I'm stuck on the 3D case. How can I find the largest block (cuboid) of "0" elements?
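
For context, here is a rough sketch of the greedy loop I have in mind (find_largest_empty_cuboid is a hypothetical helper; it is exactly the part I don't know how to write):

import numpy as np

def find_largest_empty_cuboid(mask, Nmin):
    """Hypothetical helper: return (origin, size) of the largest all-zero cuboid
    whose edges are all >= Nmin, or None if no such cuboid exists."""
    raise NotImplementedError

def greedy_fill(mask, Nmin=3):
    """Repeatedly place the largest empty cuboid until the mask is filled."""
    work = mask.copy()
    cuboids = []
    while (work == 0).any():
        found = find_largest_empty_cuboid(work, Nmin)
        if found is None:
            break  # leftover regions thinner than Nmin would need special handling
        (ox, oy, oz), (sx, sy, sz) = found
        work[ox:ox+sx, oy:oy+sy, oz:oz+sz] = 1  # mark this block as done
        cuboids.append(((ox, oy, oz), (sx, sy, sz)))
    return cuboids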

EDIT 1:

I have a draft solution that is rather brute-force and very slow on this 50x50x50 grid.

import itertools
import time

import numpy as np


def cuboid_list(mask, Nmin=3):
    """ List of all cuboids that could fit in the domain.
        The list is formatted as (volume, (Lx, Ly, Lz)). """
    size = mask[mask==0].size
    cuboids = [(i*j*k, (i, j, k)) for i, j, k in itertools.product(range(Nmin, nx), range(Nmin, ny), range(Nmin, nz)) if i*j*k <= size]
    cuboids.sort(reverse=True)
    return cuboids

def search_biggest(mask, cuboids):
    """Search biggest cuboid. """

    t0 = time.perf_counter()
    tested = 0
    origin, size = None, None
    for _, (Lx, Ly, Lz) in cuboids:
        for o in itertools.product(range(nx-Lx), range(ny-Ly), range(nz-Lz)):
            tested += 1
            if (mask[o[0]:o[0]+Lx, o[1]:o[1]+Ly, o[2]:o[2]+Lz] == 0).all():
                origin, size = o, (Lx, Ly, Lz)
                break
        if origin:
            print(f'{tested} configurations tested in {time.perf_counter() - t0:.3f} s.')
            break
    
    return origin, size


# Load mask
mask = np.load('mask.npy')
nx, ny, nz = mask.shape  # grid dimensions used by cuboid_list() and search_biggest()

# List of all cuboids
cuboids = cuboid_list(mask)

# Search biggest
origin, size = search_biggest(mask, cuboids)

In this particular case, this leads to 2795610 configurations tested in 164.984 s! The largest cuboid has origin=(16, 16, 0) and size=(33, 33, 49).

To make the algorithm usable for larger geometries (e.g. 1024x1024x512 grid), I need to speed up this process.

EDIT 2:

def search_biggest(mask, cuboids):
    """Search biggest cuboid. """
    
    # List of nonzero indexes
    forbid = set([tuple(i) for i in np.argwhere(mask != 0)])

    t0 = time.perf_counter()
    tested = 0
    origin, size = None, None
    for _, (Lx, Ly, Lz) in cuboids:
        # remove cuboid with origin in another cuboid and iterate
        for o in set(itertools.product(range(nx-Lx), range(ny-Ly), range(nz-Lz))).difference(forbid):
            tested += 1
            if (mask[o[0]:o[0]+Lx, o[1]:o[1]+Ly, o[2]:o[2]+Lz] == 0).all():
                origin, size = o, (Lx, Ly, Lz)
                break
        if origin:
            print(f'{tested} configurations tested in {time.perf_counter() - t0:.3f} s.')
            break
    
    return origin, size

Reducing the possibilities leads to 32806 configurations tested in ~2.6 s.

Any ideas?

EDIT 3:

def search_biggest(mask, cuboids):
    """Search biggest cuboid. """
    
    # List of nonzero indexes
    forbid = set([tuple(i) for i in np.argwhere(mask != 0)])

    t0 = time.perf_counter()
    tested = 0
    origin, size = None, None
    for _, (Lx, Ly, Lz) in cuboids:
        tested += 1
        tmp = [i for i in itertools.product(range(nx-Lx), range(ny-Ly), range(nz-Lz)) if i not in forbid]
        tmp = [i for i in tmp if 
               all(x not in forbid for x in itertools.product(range(i[0], i[0]+Lx), range(i[1], i[1]+Ly), range(i[2], i[2]+Lz)))]

        if tmp:
            origin, size = tmp[0], (Lx, Ly, Lz)
            print(f'{tested} configurations tested in {time.perf_counter() - t0:.3f} s.')
            break
            
    return origin, size

Simple is better than complex! This alternative implementation of search_biggest leads to 5991 configurations tested in 1.017 s!

Maybe I can reduce the list of cuboids to speed things up, but I don't think this is the way to go. I tried a 500x500x500 grid and stopped the script after an hour without a result...

I have two other possible formulations in mind:

  • Fill the domain with elementary cuboids of size Nmin x Nmin x Nmin, then merge the cuboids that share common faces. The problem is that in some situations it is not possible to fill an empty region only with cuboids of this size.

  • Search for empty rectangles in the plane (x, y, z=0), then extend them along z until a constraint is encountered. Repeat this operation for z in [1, nz-1], then merge the cuboids that can be merged. The problem is that I can end up with empty regions smaller than an elementary cuboid.

Still looking for a better way to solve this problem...

Ipse Lium
  • You can speed up the check a lot using the (3D) [summed area table](https://en.wikipedia.org/wiki/Summed-area_table) method. With this you can basically get rid of the 3rd itertools.product. But the search itself is still a bruteforce method and scales badly for larger grids. I don't know if there's a better algorithm. – user7138814 Apr 26 '22 at 12:29
  • Are the constraint regions consistently cuboidal? If they are, you can join those regions into a single block, and keep track of the true volume separately. In the example mask, this reduces the problem size from 50x50x50 to 4x4x5, which is solvable in a couple milliseconds. – yut23 Apr 28 '22 at 19:41
  • Yes, the constraint regions are always a set of cuboids. But I don't see how to reduce the problem by joining those region to a single block. – Ipse Lium Apr 28 '22 at 19:50
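
To make the suggestion in the first comment concrete, here is a minimal sketch of the 3D summed-area-table check (the helpers integral_image_3d and block_sum are illustrative names, not part of the original code). It replaces the innermost (mask[...] == 0).all() test with an O(1) lookup; the search over origins and sizes itself remains brute force, as the comment notes:

import numpy as np

def integral_image_3d(mask):
    """3D summed-area table with zero padding: S[i, j, k] == mask[:i, :j, :k].sum()."""
    S = np.zeros(tuple(s + 1 for s in mask.shape), dtype=np.int64)
    S[1:, 1:, 1:] = mask.cumsum(axis=0).cumsum(axis=1).cumsum(axis=2)
    return S

def block_sum(S, origin, size):
    """Sum of the mask over the block [origin, origin + size) by 3D inclusion-exclusion."""
    x1, y1, z1 = origin
    x2, y2, z2 = x1 + size[0], y1 + size[1], z1 + size[2]
    return (S[x2, y2, z2]
            - S[x1, y2, z2] - S[x2, y1, z2] - S[x2, y2, z1]
            + S[x1, y1, z2] + S[x1, y2, z1] + S[x2, y1, z1]
            - S[x1, y1, z1])

# A block is entirely "0" iff its sum is zero, e.g. inside search_biggest():
#     S = integral_image_3d(mask)
#     if block_sum(S, o, (Lx, Ly, Lz)) == 0: ...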

1 Answer


Given that the constraints are cuboids, we can simplify checking the possibilities by skipping the rows/columns/layers where nothing changes. For example, a 2D mask might look like this, with the redundant rows/columns marked:

   v v
 ...11.
>...11.
 1.....
>1.....
 ......

which can be reduced to

..1.
1...
....

We just need to keep track of the true lengths of each reduced cell, and use that to calculate the real volume. However, this means we need to take a different approach to finding the largest cuboid; we can't (easily) pre-generate them all and try in order of volume, since the true sizes vary depending on the point we're testing from. The code below just tries all possible cuboids and keeps track of the largest one, and seems plenty fast.

There are also some off-by-one errors in the original code, which I've avoided or fixed: range(Nmin, nx) -> range(Nmin, nx+1) in cuboid_list(), and range(nx-Lx) -> range(nx-Lx+1) in search_biggest(). The largest cuboid actually has size=(34, 34, 50).

import itertools
import time
import numpy as np

class ReducedSolver:
    def __init__(self, mask):
        self.splits = tuple(
            np.unique(
                # find the edges of the constraint regions in each axis
                [0, mask.shape[ax], *np.where(np.diff(mask, axis=ax, prepend=0))[ax]]
            )
            for ax in range(3)
        )

        # extract exactly one value (the lowest corner) for each reduced region
        self.mask = np.zeros([len(split) - 1 for split in self.splits], dtype=mask.dtype)
        for i, x in enumerate(self.splits[0][:-1]):
            for j, y in enumerate(self.splits[1][:-1]):
                for k, z in enumerate(self.splits[2][:-1]):
                    self.mask[i, j, k] = mask[x, y, z]
        # less readable:
        # self.mask = mask[np.ix_(self.splits[0][:-1], self.splits[1][:-1], self.splits[2][:-1])]

        # true sizes of each region
        self.sizes = [np.diff(split) for split in self.splits]

    def solve(self):
        """Return list of cuboids in the format (origin, size), using a greedy approach."""
        nx, ny, nz = self.mask.shape
        mask = self.mask.copy()

        result = []
        while np.any(mask == 0):
            t0 = time.perf_counter()
            tested = 0
            max_volume = 0
            # first corner of the cuboid
            for x1, y1, z1 in itertools.product(range(nx), range(ny), range(nz)):
                if mask[x1, y1, z1]:
                    continue
                # opposite corner of the cuboid
                for x2, y2, z2 in itertools.product(range(x1, nx), range(y1, ny), range(z1, nz)):
                    tested += 1
                    # slices are exclusive of the end point
                    slc = (slice(x1, x2 + 1), slice(y1, y2 + 1), slice(z1, z2 + 1))
                    # np.any doesn't short-circuit
                    if any(np.nditer(mask[slc])):
                        continue
                    true_size = tuple(np.sum(size[s]) for size, s in zip(self.sizes, slc))
                    volume = np.prod(true_size)
                    if volume > max_volume:
                        max_volume = volume
                        origin = (
                            self.splits[0][x1],
                            self.splits[1][y1],
                            self.splits[2][z1],
                        )
                        # keep track of the region in the reduced mask with `slc`
                        biggest = ((origin, true_size), slc)
            print(f"{tested} configurations tested in {(time.perf_counter() - t0)*1000:.2f} ms.")
            result.append(biggest[0])
            # mark this cuboid as filled
            mask[biggest[1]] = 1
        return result

# Load mask
mask = np.load("mask.npy")

# Solve on the simplified grid
reduced = ReducedSolver(mask)
sol = reduced.solve()

On my machine, finding the biggest cuboid on the example mask takes ~1s with search_biggest(), but is ~4ms with my code, and the entire solve takes ~20ms for 16 cuboids.

yut23
  • Seems like a nice solution to me. On the other hand, I need the cuboids to have a minimum size (Nmin x Nmin x Nmin) and a mesh point can only belong to a single cuboid (hence the absence of +1 in the ranges). – Ipse Lium May 02 '22 at 23:30
  • Ah, I see. The latter should be pretty easy to satisfy, but the minimum size constraint may need a smarter algorithm than just choosing the biggest volume. Maybe try to fill in small (width < Nmin) corners/edges first, then you can use the greedy algorithm? – yut23 May 03 '22 at 19:12
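
Following up on the last comment, one possible (untested) way to add the minimum-size constraint to ReducedSolver.solve() would be to reject candidates whose true size is below Nmin along any axis; Nmin would be a new parameter, and thin leftover regions would still need the special handling discussed above:

import numpy as np

def satisfies_min_size(true_size, Nmin):
    """Hypothetical filter: True if a candidate cuboid is at least Nmin cells
    long along every axis (true_size as computed inside ReducedSolver.solve())."""
    return all(s >= Nmin for s in true_size)

# Inside ReducedSolver.solve(), right after true_size is computed, one could add:
#     if not satisfies_min_size(true_size, Nmin):
#         continue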