
I have some high-dimensional boolean data. In this example it is an array with 4 dimensions, but the number of dimensions is arbitrary:

X.shape
 (3, 2, 66, 241)

I want to group the dataset into connected regions of True values. This can be done with scipy.ndimage.label, given a connectivity structure that says which points in the array are considered to touch. The default 2-D structure is a cross:

[[0,1,0],
 [1,1,1],
 [0,1,0]]
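
For reference, a minimal sketch of how this default cross can be produced with scipy.ndimage.generate_binary_structure, both in 2-D and in 4-D. The variable names are only illustrative:

```python
import numpy as np
from scipy import ndimage

# rank=2 gives a 2-D element; connectivity=1 keeps only the face
# neighbours of the centre, i.e. the cross shown above
cross_2d = ndimage.generate_binary_structure(2, 1)
print(cross_2d.astype(int))

# The same call with rank=4 gives the fully connected 4-D cross:
# the centre plus two neighbours along each of the 4 axes
cross_4d = ndimage.generate_binary_structure(4, 1)
print(cross_4d.shape, int(cross_4d.sum()))  # (3, 3, 3, 3) 9
```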

This cross is easily extended to higher dimensions if all of those dimensions are connected. However, I want to generate such a structure programmatically from a list of which dims are connected to which:

#We want to find connections across dims 2 and 3 within each slice of dims 0 and 1:
dim_connections=[[0],[1],[2,3]]

#Now we want two separate connected subspaces in our data:
dim_connections=[[0,1],[2,3]]

For individual cases I can work out, with some hard thinking, how to generate the correct structuring element, but I am struggling to work out the general rule. For clarity, I want something like:

my_structure=construct_arbitrary_structure(ndim, dim_connections)
the_correct_result=scipy.ndimage.label(X,structure=my_structure)
Cris Luengo
JoshD
  • I don't think the second example, `dim_connections=[[0,1],[2,3]]` is possible in one call to `scipy.ndimage.label`. By analogy, say you wanted to find all vertical connections and all horizontal connections, without finding components which are connected by both vertical and horizontal connections. I don't think there's any structure that lets you do that. – Nick ODell Nov 24 '22 at 17:24
  • Thanks, yes I can see that! But it could be handled by two calls to `label` I think? If I can find the way to generate the structure when simply excluding certain axes. – JoshD Nov 25 '22 at 17:11
  • *“For individual cases I can work out with hard-thinking how to generate the correct structure tensor.”* What is the expected output for the example then? I just don’t see how this could be possible, or meaningful. As Nick said, you might be able to do this in two separate labeling steps, but then what? How would you combine that information? What is the meaning? – Cris Luengo Nov 28 '22 at 15:06
  • Perhaps as you suggest the second example is not well defined... – JoshD Nov 30 '22 at 20:10
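
The two-call idea raised in the comments could look roughly like the sketch below. It assumes random stand-in data and builds each structure by hand by embedding a 2-D cross in the centre of the excluded axes. It only produces two independent labellings; how (or whether) to combine them is left open, as the comments point out.

```python
import numpy as np
from scipy import ndimage

# Stand-in 4-D boolean data (an assumption, not the asker's dataset)
rng = np.random.default_rng(0)
X = rng.random((3, 2, 6, 8)) > 0.5

cross_2d = ndimage.generate_binary_structure(2, 1)

# Neighbours only along dims 0 and 1: put the 2-D cross in the
# centre position of dims 2 and 3
structure_01 = np.zeros((3, 3, 3, 3), dtype=int)
structure_01[:, :, 1, 1] = cross_2d

# Neighbours only along dims 2 and 3: put the 2-D cross in the
# centre position of dims 0 and 1
structure_23 = np.zeros((3, 3, 3, 3), dtype=int)
structure_23[1, 1, :, :] = cross_2d

# One labelling per group of connected dims
labels_01, n_01 = ndimage.label(X, structure=structure_01)
labels_23, n_23 = ndimage.label(X, structure=structure_23)
print(n_01, n_23)
```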

1 Answer


This should work for cases like your first example, where a dimension listed on its own is kept separate and dimensions listed together are connected. As noted in the comments, a single structure cannot keep two multi-dimension groups apart: with dim_connections=[[0,1],[2,3]] the result is just the ordinary cross.


import numpy as np

def construct_arbitrary_structure(ndim, dim_connections):
    #Create a 3x3x...x3 structure array and set its centre element
    structure = np.zeros([3] * ndim, dtype=int)
    center = (1,) * ndim
    structure[center] = 1

    #Fill structure array
    for d in dim_connections:
        # A dimension listed on its own is connected to nothing,
        # so it contributes no neighbours
        if len(d) < 2:
            continue
        for axis in d:
            # Mark the two neighbours of the centre along this axis
            for offset in (0, 2):
                idx = list(center)
                idx[axis] = offset
                structure[tuple(idx)] = 1

    #The result is symmetric about the centre by construction
    return structure
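
As a quick sanity check, here is a small sketch of what the structure for dim_connections=[[0],[1],[2,3]] should do. The structure is built by hand here (a 2-D cross embedded in the centre of dims 0 and 1), and the tiny test array is an assumption made up for illustration:

```python
import numpy as np
from scipy import ndimage

# Tiny 4-D boolean array: the same 2x2 patch of True values sits in
# two different slices along dim 0
X = np.zeros((2, 1, 4, 4), dtype=bool)
X[0, 0, 1:3, 1:3] = True
X[1, 0, 1:3, 1:3] = True

# Hand-built structure: neighbours along dims 2 and 3 only
structure = np.zeros((3, 3, 3, 3), dtype=int)
structure[1, 1] = ndimage.generate_binary_structure(2, 1)

# The patches do not touch along dims 2 and 3, so they stay separate
labels, n_regions = ndimage.label(X, structure=structure)
print(n_regions)  # 2

# With the default structure (full 4-D cross) they merge along dim 0
_, n_default = ndimage.label(X)
print(n_default)  # 1
```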
DotNetRussell