
Let's say I have a NumPy array (in this case it represents a 100x100 binary image):

import numpy as np
img = np.random.randint(0, 2, (100, 100)).astype(np.uint8)

How best to determine the "average position" of the 1 values in the array? For instance, if there were a cluster of 1's in the array, I would like to find the center of that cluster.

Chris

1 Answer

I see you tagged this as numpy too, so I'd do this:

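# np.meshgrid defaults to 'xy' indexing, so its first argument runs along
# the columns (shape[1]) and its second along the rows (shape[0]).
x = range(0, img.shape[1])
y = range(0, img.shape[0])

(X,Y) = np.meshgrid(x,y)

x_coord = (X*img).sum() / img.sum().astype("float")  # mean column index
y_coord = (Y*img).sum() / img.sum().astype("float")  # mean row index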

That would give you the weighted average center, with x_coord measured along the columns and y_coord along the rows.

If you want this for every cluster of 1's in the image, I suggest using connected components to mask the cluster you're interested in. Rather than repeating this process once per cluster, it is probably better to compute all cluster averages in the same array traversal.
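As a rough sketch of the per-cluster idea, `scipy.ndimage` provides `label` for connected components and `center_of_mass` for centroids. The small two-block test image below is made up purely for illustration, and `label` is used with its default connectivity (direct neighbours only, no diagonals), which may or may not be what you want:

```python
import numpy as np
from scipy import ndimage

# Hypothetical test image with two separate clusters of 1's.
img = np.zeros((10, 10), dtype=np.uint8)
img[1:3, 1:3] = 1   # 2x2 block covering rows 1-2, columns 1-2
img[6:9, 5:8] = 1   # 3x3 block covering rows 6-8, columns 5-7

# Assign an integer label to each connected component (0 is background).
labels, n_clusters = ndimage.label(img)

# One (row, col) centroid per label, all computed in a single call.
centers = ndimage.center_of_mass(img, labels, np.arange(1, n_clusters + 1))

for i, (row, col) in enumerate(centers, start=1):
    print("cluster", i, "row", row, "col", col)
```

Note that `center_of_mass` returns coordinates in array order, (row, column), which is the reverse of the (x, y) order used above.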

Diana
  • [`scipy.ndimage`](http://docs.scipy.org/doc/scipy/reference/ndimage.html) has a `center_of_mass` function. It also has `label` to find clusters. – Jaime Jun 30 '13 at 04:04
  • @Jaime: Oh, that's probably also faster than my solution. I'll take a look. Thanks for the tip! – Diana Jun 30 '13 at 05:11
  • Well, if you set `sparse=True` in your call to `np.meshgrid` and store `img.sum()` in an auxiliary variable, to reuse it instead of calculating it twice, I don't think your code is going to perform any worse than scipy's. I don't think you need the `.astype(float)` either, because you are dividing numpy scalars, not Python ints, but I may be wrong... – Jaime Jun 30 '13 at 14:09
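
For reference, the variant suggested in the last comment might look roughly like the sketch below. It assumes Python 3, where `/` is true division, so the `.astype(float)` is dropped; the seeded random generator is only there to make the example reproducible, and no timing comparison against scipy is claimed here:

```python
import numpy as np

# Reproducible 100x100 binary image (the seed is an arbitrary choice).
rng = np.random.default_rng(0)
img = rng.integers(0, 2, size=(100, 100), dtype=np.uint8)

# sparse=True returns X with shape (1, ncols) and Y with shape (nrows, 1);
# broadcasting against img avoids building two full coordinate arrays.
X, Y = np.meshgrid(np.arange(img.shape[1]), np.arange(img.shape[0]), sparse=True)

total = img.sum()                   # computed once and reused
x_coord = (X * img).sum() / total   # mean column index of the 1's
y_coord = (Y * img).sum() / total   # mean row index of the 1's

# Cross-check: averaging the indices of the nonzero pixels gives the same result.
rows, cols = np.nonzero(img)
print(np.isclose(x_coord, cols.mean()), np.isclose(y_coord, rows.mean()))
```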