I have an array that contains the size and location of a series of shapes: where the array is zero there is no shape, and where it is non-zero there is a shape. Different shapes are separated by zeros, so that if you were to plot every point in the array you would see a map of the various shapes. I hope that makes sense; if not, here is an example array containing 4 different shapes:

np.array([[0, 0, 0, 1, 0, 0, 0],
          [0, 0, 0, 0, 0, 0, 0],
          [1, 1, 0, 0, 1, 0, 0],
          [1, 1, 0, 0, 0, 1, 1],
          [0, 0, 0, 0, 0, 1, 1],
          [3, 5, 2, 0, 0, 0, 0]])

I need to count and identify these shapes, but I only want to include the ones with an area above a certain threshold. I would like the area threshold to be 1/15 of the area of the largest shape in the field. (In the above example, the largest area would be 5.)

The question is: how can I find (using Python) the area of the largest shape in the field without individually identifying each shape?

Edit

To clarify what I mean by the 'shapes', the following code plots an image of the array, which shows 4 distinct objects:

import numpy as np
import matplotlib.pyplot as plt

a = np.array([[0, 0, 0, 1, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0],
              [1, 1, 0, 0, 1, 0, 0],
              [1, 1, 0, 0, 0, 1, 1],
              [0, 0, 0, 0, 0, 1, 1],
              [1, 1, 1, 0, 0, 0, 0]])
ind = np.nonzero(a)
x = ind[0]
y = ind[1]
plt.imshow(a)
plt.show()
ali_m
heliqua

1 Answer

You can use scipy.ndimage.label to find the connected non-zero regions in your array, then use scipy.ndimage.sum to find the area of each region:

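from scipy import ndimage

# label each connected non-zero region with an integer from 1 to nshapes
labels, nshapes = ndimage.label(a)
# sum a over each label; the index must include the last label, nshapes.
# The sum equals the area only because a contains just 0s and 1s.
areas = ndimage.sum(a, labels=labels, index=range(1, nshapes + 1))

# position of the largest area; labels are 1-based, hence the + 1
idx = np.argmax(areas)
biggest_shape = labels == (idx + 1)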

In your example there happen to be two 'shapes' with the same area:

from matplotlib import pyplot as plt

fig, (ax1, ax2, ax3) = plt.subplots(1, 3)

ax1.imshow(a, cmap=plt.cm.jet)
ax2.imshow(labels, cmap=plt.cm.jet)
ax3.imshow(biggest_shape, cmap=plt.cm.jet)
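
Note that summing `a` per label equals the area only when `a` holds just 0s and 1s. The first array in the question has a row of 3, 5, 2, for which the sum would be 10 rather than 3. A minimal sketch that counts cells per label instead, using `np.bincount` on the label image (the names `cell_counts` and `largest_area` are mine, not part of the code above):

```python
import numpy as np
from scipy import ndimage

# First example array from the question (the last row holds values other than 1).
a = np.array([[0, 0, 0, 1, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0],
              [1, 1, 0, 0, 1, 0, 0],
              [1, 1, 0, 0, 0, 1, 1],
              [0, 0, 0, 0, 0, 1, 1],
              [3, 5, 2, 0, 0, 0, 0]])

# Any non-zero value counts as part of a shape.
labels, nshapes = ndimage.label(a)

# bincount counts how many cells carry each label; entry 0 is the background,
# so it is dropped to leave one count per shape (labels 1..nshapes).
cell_counts = np.bincount(labels.ravel(), minlength=nshapes + 1)[1:]

largest_area = cell_counts.max()
```

With the default connectivity this finds five regions and a largest area of 4, since diagonal neighbours are not yet joined.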

Update

The structure argument passed to scipy.ndimage.label determines which neighbouring elements are considered to be 'connected' (see the docs linked above). By default only orthogonally adjacent elements are connected. If you want diagonally adjacent elements to be considered connected as well, you can pass a 3x3 array of ones:

labels, nshapes = ndimage.label(a, structure=np.ones((3, 3)))
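
Putting the pieces together for the original goal (keep only the shapes whose area is above 1/15 of the largest), a rough sketch could look like the following. The names `threshold`, `keep` and `filtered` are illustrative, and the strict `>` comparison is my assumption about what 'above' should mean:

```python
import numpy as np
from scipy import ndimage

a = np.array([[0, 0, 0, 1, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0],
              [1, 1, 0, 0, 1, 0, 0],
              [1, 1, 0, 0, 0, 1, 1],
              [0, 0, 0, 0, 0, 1, 1],
              [1, 1, 1, 0, 0, 0, 0]])

# 8-connectivity: diagonal neighbours belong to the same shape.
labels, nshapes = ndimage.label(a, structure=np.ones((3, 3)))

# Area of each shape = number of cells carrying its label (labels 1..nshapes).
areas = np.bincount(labels.ravel(), minlength=nshapes + 1)[1:]

# Threshold is 1/15 of the largest area (assumed strict comparison).
threshold = areas.max() / 15.0
keep = np.flatnonzero(areas > threshold) + 1  # labels of the shapes to keep

# Zero out every shape whose area is at or below the threshold.
filtered = np.where(np.isin(labels, keep), a, 0)
```

For this array there are four shapes with a largest area of 5, so the threshold is 1/3 and every shape is kept.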

ali_m
  • Thank you, this is really useful. Is there a way I could adapt it to include the diagonals? I want the biggest shape to be the 'orange and yellow' shape combined with area 5. – heliqua Aug 11 '14 at 07:45